
NPS    52-79-004 


NAVAL  POSTGRADUATE  SCHOOL 

Monterey,  California 


On  the  Computational  Complexity 
of  Branch  and  Bound  Search  Strategies 


by 


Douglas  R.  Smith 
November  1979 


Approved for public release; distribution unlimited.


Prepared for:

National Science Foundation
Washington, D. C. 20550


NAVAL  POSTGRADUATE  SCHOOL 
Monterey,  California 


Rear Admiral T. F. Dedman, Superintendent
Jack R. Borsting, Provost


This  research  was  partially  supported  by  the  National 
Science  Foundation. 


Reproduction  of  all  or  part  of  this  report  is  authorized. 


UNCLASSIFIED 


SECURITY   CLASSIFICATION   OF   THIS  PAGE  (When  Data  Entered) 


REPORT  DOCUMENTATION  PAGE 


READ  INSTRUCTIONS 
BEFORE  COMPLETING  FORM 


1.     REPORT   NUMBER 


NPS    52-79-004 


2.  GOVT   ACCESSION  NO. 


3.     RECIPIENT'S  CATALOG   NUMBER 


4.     TITLE  (and  Subtitle) 


On  The  Computational  Complexity  Of  Branch 
And  Bound  Search  Strategies 


5.     TYPE  OF  REPORT  &  PERIOD  COVERED 

Technical  Report 


6.     PERFORMING  ORG.   REPORT  NUMBER 


7. AUTHOR(s)


Douglas   R.    Smith 


8. CONTRACT OR GRANT NUMBER(s)

NSF  Grant   MCS74-14445-A01 


9.     PERFORMING  ORGANIZATION    NAME   AND   ADDRESS 

Naval  Postgraduate  School 
Monterey,  CA  93940 


10. PROGRAM ELEMENT, PROJECT, TASK
AREA & WORK UNIT NUMBERS


11. CONTROLLING OFFICE NAME AND ADDRESS

National  Science  Foundation 
Washington,  D.  C.  20550 


12.     REPORT  DATE 

November   1979 


13.     NUMBER  OF   PAGES 
100 


14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)


15. SECURITY CLASS. (of this report)

Unclassified


15a. DECLASSIFICATION/DOWNGRADING
SCHEDULE


16. DISTRIBUTION STATEMENT (of this Report)


Approved for Public Release; distribution unlimited


17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)


18. SUPPLEMENTARY NOTES


19. KEY WORDS (Continue on reverse side if necessary and identify by block number)


Combinatorial  Optimization 
Branch  and  Bound 
Complexity  of  Computation 
Search  Strategy 


Tree  Search 
Probabilistic  Modelling 


20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

Many  important  problems  in  operations  research,  artificial  intelligence, 
and  other  areas  of  computer  science  seem  to  require  search  in  order  to 
find  an  optimal  solution.  A  branch  and  bound  procedure,  which  imposes  a 
tree  structure  on  the  search,  is  often  the  most  efficient  known  means  for 
solving  these  problems.  While  for  some  branch  and  bound  algorithms  a  worst 
case  complexity  bound  is  known,  the  average  case  complexity  is  usually 
unknown  despite  the  fact  that  it  gives  more  information  about  the 


DD FORM 1473 (1 JAN 73)  EDITION OF 1 NOV 65 IS OBSOLETE

S/N 0102-014-6601


performance  of  the  algorithm.  In  this  dissertation  the  branch  and  bound 
method  is  discussed  and  a  probabilistic  model  of  its  domain  is  given, 
namely a class of trees with an associated probability measure. The
best-bound-first search strategy and depth-first search strategy are
discussed and results on the expected time and space complexity of these
strategies are presented and discussed. The best-bound-first search
strategy is shown to be optimal in both time and space. These results are
illustrated by data from randomly generated traveling salesman problems.
Evidence is presented which suggests that the asymmetric traveling
salesman problem can be solved in time $O(n^3 \ln^2(n))$ on the average.




ABSTRACT 


Many important problems in operations research, artificial
intelligence, combinatorial algorithms, and other areas seem to
require search in order to find an optimal solution. A branch and
bound procedure, which imposes a tree structure on the search, is
often the most efficient known means for solving these problems.
While for some branch and bound algorithms a worst case complexity
bound is known, the average case complexity is usually unknown
despite the fact that it gives more information about the
performance of the algorithm. In this dissertation the branch and
bound method is discussed and a probabilistic model of its domain
is given, namely a class of trees with an associated probability
measure. The best-bound-first and depth-first search strategies
are discussed and results on the expected time and space
complexity of these strategies are presented and compared. The
best-bound-first search strategy is shown to be optimal in both
time and space. These results are illustrated by data from random
traveling salesman problems. Evidence is presented which suggests
that the asymmetric traveling salesman problem can be solved
exactly in time $O(n^3 \ln^2(n))$ on the average.


Chapter 1.
Introduction

By an instance of a combinatorial problem we mean the problem of
finding a constructive proof of

    $\exists x\, P(x)$ for $x \in FS$        (1)

where FS is a discrete set of objects and P is a predicate defined
on FS. That is, we want to find an object in FS which has the
property P. In many cases it is not the truth but rather the
feasibility of a constructive proof of (1) which is in doubt. Some
combinatorial problems require all solutions which satisfy (1). We
restrict ourselves to the problem of finding a single solution,
but note that all results obtained in this case can be extended to
handle this slightly harder problem. There are countless examples
which satisfy (1), ranging from easy problems like sorting (find a
permutation of an input list which is sorted), to more difficult
problems like integer programming (find a vector of integers which
satisfies a set of constraints) and theorem proving (find a proof
sequence for a statement in some language by means of a given set
of axioms and rules of inference).

Generally when we speak of a combinatorial problem, we mean a set
C of related instances of the form (1). These instances can be
classified according to their size, enabling us to speak of a
problem instance of size n. The question of what the size of an
instance is and how to encode problem instances can be tricky. See
[Aho, Hopcroft and Ullman 1974] for a discussion of encodings in
the context of the classes of problems called P and NP. For our
purposes we will say that a measure of the size of a problem
instance has the property that all problem instances with the same
size have the same feasible set FS. For example, in an instance of
the sorting problem we are given a vector of n numbers. Here n is
taken as the size of the instance and the feasible set is the set
of permutations of n objects. The predicate P(x) tests whether a
permutation x applied to the given vector results in a sorted
vector. In an instance of an integer programming problem, we are
given a set of constraints on n variables. Here n is taken as the
instance size and the feasible set is the set of all integer
vectors of length n. In theorem proving we are given a statement
and take its length as the size of the instance. Here the feasible
set is the set of all legal proof sequences in the theory. If P
contains an optimization clause then (1) is called a combinatorial
optimization problem. In this dissertation we will be particularly
interested in combinatorial minimization problems in which we seek
a constructive proof of

    $\exists x\, [P(x) \wedge \forall y\, [P(y) \Rightarrow f(x) \le f(y)]]$ for $x, y \in FS$        (2)

where f, called the objective function, maps FS into the
nonnegative reals.
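
When FS is finite and small, a constructive proof of (1) or (2) can be obtained by simply generating and testing every object. The following Python sketch illustrates this under stated assumptions: the function names `enumerate_first` and `enumerate_min`, the example vector `v`, and the predicate `sorts_v` are hypothetical and chosen only for illustration.

```python
from itertools import permutations

def enumerate_first(feasible_set, P):
    """Return the first object of the feasible set that satisfies P, or None."""
    for x in feasible_set:
        if P(x):
            return x
    return None

def enumerate_min(feasible_set, P, f):
    """Exhaustively search a finite feasible set for an object that
    satisfies P and minimizes the objective function f, as in (2)."""
    best = None
    for x in feasible_set:
        if P(x) and (best is None or f(x) < f(best)):
            best = x
    return best

# Sorting as an instance of (1): FS is the set of permutations of n indices.
v = [3, 1, 2]

def sorts_v(perm):
    """P(x): applying the permutation to v yields a sorted vector."""
    arranged = [v[i] for i in perm]
    return all(a <= b for a, b in zip(arranged, arranged[1:]))

perm = enumerate_first(permutations(range(len(v))), sorts_v)
sorted_v = [v[i] for i in perm]

# A toy instance of (2): the even number in 0..9 closest to 5.
closest_even = enumerate_min(range(10), lambda x: x % 2 == 0,
                             lambda x: (x - 5) ** 2)
```

The cost of this approach grows with the size of FS (n! permutations in the sorting example), which is why it serves only as a baseline.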
Some combinatorial problems can be solved directly; i.e. the
solution is reached by a straightforward construction with no
backtracking. When no direct constructive method is known, there
are three principal search methods for finding solutions to
combinatorial problems, called enumerative search, local search
and global search. In an enumerative search the objects of FS are
produced one at a time and tested. The search terminates the first
time that P is satisfied. If we are seeking an optimal object then
FS must be exhaustively searched, and FS must be finite in order
to assure termination. Some problems, such as that of finding a
key in an unordered list, require enumerative search. Local search
[Reiter and Sherman 1965; Lin 1965; Weiner, Savage, and Bagchi
1973; Papadimitriou and Steiglitz 1977] is usually applied to
combinatorial optimization problems and is characterized by a
topology or neighborhood structure imposed on the set of objects
FS. For any object in the set we can readily find all of its
neighbors. A search proceeds by selecting some initial object,
picking a neighbor which satisfies P and betters the value of the
objective function, then picking a neighbor of the neighbor and so
on, until an object is found which is optimal with respect to all
of its neighbors. This object is called a local optimum. If the
neighborhood structure is exact (a local optimum is a global
optimum) then the search can terminate on the first local optimum
found. For many problems it is difficult to find or infeasible to
use an exact neighborhood structure and so a neighborhood
structure with many local optima is used. For example it has been
shown [Weiner, Savage and Bagchi 1973] that in an exact
neighborhood structure for the traveling salesman problem on n
cities each object (a cyclic permutation of the n cities) must
have at least (n-2)!/2 neighbors, therefore rendering exact local
search infeasible. If many local optima exist then the best we can
do is to restart the search at a new initial object, eventually
obtaining a set of local optima from which the best may be picked.
These local search methods are analogous to the descent and
gradient methods of mathematical programming [Luenberger 1973]. On
complex spaces this method is best suited for finding approximate
solutions, i.e. objects which are nearly optimal but not
necessarily optimal.
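
The local search scheme with restarts described above can be sketched as follows. This is only an illustration under assumptions not taken from the text: the neighborhood is a simple two-city exchange (which is not exact), the best improving neighbor is chosen at each step, and the names `tour_cost`, `neighbors`, `local_search`, `restart_local_search` and the cost matrix `C` are hypothetical.

```python
import random

def tour_cost(tour, C):
    """Sum of arc weights along the cyclic tour."""
    n = len(tour)
    return sum(C[tour[i]][tour[(i + 1) % n]] for i in range(n))

def neighbors(tour):
    """All tours obtained by exchanging two cities (city 0 stays first)."""
    n = len(tour)
    for i in range(1, n):
        for j in range(i + 1, n):
            t = list(tour)
            t[i], t[j] = t[j], t[i]
            yield tuple(t)

def local_search(start, C):
    """Move to the best improving neighbor until none exists."""
    current = start
    while True:
        better = [t for t in neighbors(current)
                  if tour_cost(t, C) < tour_cost(current, C)]
        if not better:
            return current            # local optimum reached
        current = min(better, key=lambda t: tour_cost(t, C))

def restart_local_search(C, restarts, seed=0):
    """Run local search from several random initial tours; keep the best."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        rest = list(range(1, len(C)))
        rng.shuffle(rest)
        optimum = local_search((0, *rest), C)
        if best is None or tour_cost(optimum, C) < tour_cost(best, C):
            best = optimum
    return best

# Hypothetical asymmetric cost matrix on 4 cities.
C = [[0, 2, 9, 10],
     [1, 0, 6, 4],
     [15, 7, 0, 8],
     [6, 3, 12, 0]]
best = restart_local_search(C, restarts=5)
```

The returned tour is optimal with respect to its neighbors, but nothing in the sketch guarantees that it is a global optimum.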

A global search is characterized by the handling of sets of
objects rather than single objects at a time as in local search.
A powerful form of global search may be described as follows. The
problem again is to find an object in a set FS which satisfies P.
If such an object cannot be found easily then we generate a set of
subproblems by splitting FS into subsets. The i-th subproblem has
the form

    $\exists x\, P(x)$ for $x \in FS_i \subseteq FS$        (3)

where $\bigcup_i FS_i = FS$. This process of creating subproblems
by means of splitting the feasible set is repeated until a
solution is found in one of the subsets (which may not occur until
the sets are reduced to singleton sets). A global search is the
essence of the well-known backtrack technique [Lehmer 1958; Golomb
and Baumert 1965; Knuth 1974] of which branch and bound is a
special case. Another kind of global search which is related to
branch and bound is the well-known technique of dynamic
programming [Bellman 1957; Morin and Marsten 1976, 1978; Ibaraki
1978].
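
The splitting process of (3) can be sketched as a small backtrack search. In this illustration, which rests on assumptions not made in the text, FS is the set of 0/1 vectors of length n, a subset of FS is represented by a common prefix, and each split fixes one more component. The names `global_search`, `weights`, `target` and `P` are hypothetical.

```python
def global_search(prefix, n, P):
    """Search the subset of FS made of all 0/1 vectors of length n that
    extend `prefix`; return an object satisfying P, or None."""
    if len(prefix) == n:              # singleton subset: test its only object
        return prefix if P(prefix) else None
    for bit in (0, 1):                # split the subset into two subproblems
        found = global_search(prefix + (bit,), n, P)
        if found is not None:
            return found
    return None

# Hypothetical instance: choose weights that add up to the target.
weights = [5, 3, 8, 2]
target = 10

def P(x):
    return sum(w for w, b in zip(weights, x) if b) == target

solution = global_search((), len(weights), P)
```

Here a subset is only tested once it has been reduced to a singleton; branch and bound improves on this by discarding whole subsets early.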

The question of whether search is the most efficient method for
solving some combinatorial problems is a deep one which might be
specialized in the well-known P=NP question [Cook 1971; Karp
1972]. Problems which can be solved directly tend to have fast
algorithms which run in polynomial time in the problem size. On
the other hand search algorithms for a problem tend to have a
worst-case running time which includes as a factor the size of FS,
the feasible set. The NP-complete problems are a class of problems
for which either all or none are solvable by algorithms which run
in time given by a polynomial of the problem size. Since the size
of the search space FS of the NP-complete problems is
superpolynomial (usually either exponential or factorial), and all
known deterministic algorithms for NP-complete problems have
superpolynomial worst-case time bounds, one might conjecture that
the P=NP question is equivalent to the question of whether
NP-complete problems require search for their solution. At present
global search algorithms of the branch and bound variety are the
most efficient known methods for solving NP-hard problems. It may
be that in answering the P=NP question wholly new solution methods
will be found which obviate the need for search. However the
complexity of some global search algorithms is our best current
estimate of the intrinsic complexity of a wide range of important
combinatorial problems.

The complexity of a global search algorithm has usually been
measured by its worst case behavior over all instances of a
problem, i.e. an upper bound on its performance. The obvious
problem with such a measure is that it gives little information
about the usual or average performance of the algorithm. For
example recently [Klee and Minty 1970] some examples have been
found which cause the simplex algorithm for solving linear
programs to run in exponential time, yet its usual performance is
so good that it is one of the most widely used computer
algorithms. It is especially true of global search algorithms,
which can have widely varying behaviors over the set of instances
of a problem, that the average case complexity gives more
information than a worst-case measure about the performance of the
algorithm.

Branch and bound is a global search technique applicable to
combinatorial minimization problems. In the past decade branch and
bound seems to have emerged as the principal method for solving
problems of this type which have no direct solution. Just a few of
the applications of the branch and bound method include integer
programming [Garfinkel and Nemhauser 1972], flow shop and job shop
sequencing [Ignall and Schrage 1965], traveling salesman problems
[Bellmore and Nemhauser 1968; Bellmore and Malone 1972], heuristic
search in the form of the A* algorithm [Hart, Nilsson, and Raphael
1968; Nilsson 1972], and pattern recognition [Kanal 1978]. The
alpha-beta technique used in game playing is an extension of
branch and bound to the game tree environment [Knuth and Moore
1975].

Branch and bound algorithms can be roughly classified into two
kinds according to properties of the trees they generate. In the
first kind, solutions to the problem only occur at or below some
fixed depth in the tree depending on the problem size. In this
approach a solution is built up a component at a time until a
complete object is created. In the second kind of algorithm a
solution may be found at any depth of the tree (including the
possibility that the solution is found at the root). Relaxation
procedures fall in this category. In a relaxation procedure a
relaxed version of the problem is solved at each node of the tree.
In a relaxation of a combinatorial problem of the form (1) we want
a constructive proof of

    $\exists x\, P(x)$ for $x \in FS'$        (4)

where $FS \subseteq FS'$. This approach may be useful if there is
a fast algorithm for solving (4). If the relaxed solution is also
a solution to the restricted problem (the solution is in FS), then
we are done; otherwise the relaxed solution is used to create
subproblems by splitting FS' into subsets in such a way that the
relaxed solution is precluded from further consideration. The i-th
subproblem has the form

    $\exists x\, P(x)$ for $x \in FS'_i \subset FS'$        (5)

where $FS \subseteq \bigcup_i FS'_i \subset FS'$ (c.f. (3)). It is
this second kind of branch and bound algorithm which will be
modeled and studied in this dissertation. This is not to say that
the results of the dissertation do not apply to algorithms of the
first kind but merely that our intent was to study the second
kind.

The purpose of this dissertation is to analyze the branch and
bound procedure under several search strategies in order to obtain
quantitative estimates of their expected time and space
requirements. The results of this analysis provide a framework for
analyzing and predicting the expected resource requirements of
specific branch and bound algorithms as illustrated in chapter 7.
These results may also be used to compare the relative efficiency
of search strategies.

In chapter 2 the branch and bound algorithm and its domain are
presented and several important properties are derived. In chapter
3 our model of branch and bound search trees is introduced and
properties of the model trees are derived. Also the sense in which
we will use the term complexity is developed and discussed.
Chapter 4 develops results on the complexity of general search
strategies. Chapters 5 and 6 apply these results to the
best-bound-first search strategy and the depth-first search
strategy respectively. Also in chapter 6 the expected time
complexity of a depth-first search is studied as a function of the
depth of the first solution found in the search tree. Using the
results obtained in previous chapters, a subtour elimination
algorithm for the traveling salesman problem is modeled in chapter
7 and it is suggested that it has an expected running time of
$O(n^3 \ln^2(n))$. The reader may wish to read chapter 7 in
parallel with chapters 5 and 6 in order to see an application of
the theorems being developed.


Chapter 2.
Branch and Bound Algorithms

Branch and bound algorithms are designed to solve combinatorial
minimization problems. We will denote the set of instances of size
n of a combinatorial minimization problem by
$C_n = (FS_n, COST_n)$ where $FS_n$ is a countable set of objects
called the feasible set and $COST_n$ is a set of cost functions
such that any $c \in COST_n$ maps $FS_n$ into nonnegative
integers. The parameter n of a class $C_n$ varies over positive
integers and is intended as a natural measure of the size of the
instances of the problem. We will assume that the cost functions
must satisfy the condition that no more than a finite number of
objects in $FS_n$ may have a given cost. A problem instance from
$C_n$ has the following form: Given $c \in COST_n$, find
$s^* \in FS_n$ such that for all $s \in FS_n$,
$c(s^*) \le c(s)$, i.e. find a least cost object in the feasible
set. In the following discussion we will omit the subscript on FS
and COST when no confusion can arise. The idea of a branch and
bound search is to split FS into subsets and to compute a lower
bound on the cost of the objects within each subset. Those subsets
whose bound exceeds the cost of some known (perhaps nonoptimal)
solution can be discarded since they cannot contain an optimal
solution. The remaining subsets are repeatedly split and bounded
until an object is found whose cost does not exceed the bound on
any subset, hence that object is a minimal cost solution. The
special power of the branch and bound method comes from this
ability to prune away whole sets of objects when they can be shown
not to contain an optimal object. The choice of which of the
currently unexamined subsets to examine next is specified by a
search strategy.

We will use the example of the Traveling Salesman Problem (TSP)
throughout this dissertation. The TSP originated with the problem
of finding the shortest route for visiting all of n cities and
returning to the starting point. We will deal with the following
generalization of the TSP of size n. Given a complete directed
graph with n nodes and arc weights given by an n x n cost matrix,
find the least cost Hamiltonian cycle (a cycle which passes once
through each node of the graph). Clearly the set of Hamiltonian
cycles on a complete directed graph is isomorphic to the set of
cyclic permutations of n objects. Here the feasible set is the set
of all Hamiltonian cycles on a complete directed graph of n nodes
(or cyclic permutations). The cost functions are a set of cost
matrices which specify the arc weights. The cost of a Hamiltonian
cycle is the sum of the weights on the arcs of the cycle. If
$C = [c_{ij}]$ is a cost matrix, then $c_{ij}$ is the weight on
the directed arc from node i to node j. We do not require that
$c_{ij} = c_{ji}$. There is a long history of attempts to devise
efficient algorithms for solving traveling salesman problems. At
present the most efficient algorithms for solving TSPs make use of
a relaxation procedure embedded in a branch and bound algorithm.
The Held and Karp algorithm [Held and Karp 1971] for solving
symmetric TSPs (the cost matrix is required to be symmetric) makes
use of a relaxation based on minimum spanning trees. The most
efficient known approach to solving asymmetric TSPs makes use of a
relaxation which allows all permutations to be feasible rather
than just cyclic permutations. This relaxed problem is known as
the Assignment Problem.
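
To make the relation between the TSP and its Assignment Problem relaxation concrete, the following Python sketch solves both by brute force on a tiny instance. It is an illustration only: a permutation is stored as a successor list, the diagonal of the cost matrix is set to infinity so that a node is never assigned to itself (an assumption, not stated in the text), and the names `is_cyclic`, `perm_cost`, `brute_force` and the matrix `C` are hypothetical.

```python
from itertools import permutations

INF = float("inf")

def is_cyclic(perm):
    """True if the mapping i -> perm[i] is one cycle through all nodes."""
    node, steps = 0, 0
    while True:
        node = perm[node]
        steps += 1
        if node == 0:
            return steps == len(perm)

def perm_cost(perm, C):
    """Total weight of the arcs i -> perm[i]."""
    return sum(C[i][perm[i]] for i in range(len(perm)))

def brute_force(C, cyclic_only):
    """Least cost permutation; restricted to cyclic ones for the TSP."""
    candidates = (p for p in permutations(range(len(C)))
                  if not cyclic_only or is_cyclic(p))
    return min(candidates, key=lambda p: perm_cost(p, C))

# Two cheap pairs of cities: the relaxation prefers two subtours.
C = [[INF, 1, 10, 10],
     [1, INF, 10, 10],
     [10, 10, INF, 1],
     [10, 10, 1, INF]]

relaxed = brute_force(C, cyclic_only=False)   # Assignment Problem solution
tour = brute_force(C, cyclic_only=True)       # TSP solution
```

On this instance the relaxed solution consists of the two subtours 0-1 and 2-3 with cost 4, while the least cost Hamiltonian cycle costs 22, so the relaxed cost is a lower bound but the relaxed solution is not feasible for the TSP.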

A branch and bound algorithm has three major components. A
branching rule B is a rule determining if and how a subset of FS
is to be split into subsets. If the least cost object in a subset
can be extracted easily then the branching rule does not split the
subset. Otherwise the subset is split into a finite number of
proper subsets which then represent smaller and therefore easier
subproblems to solve. Note that the repeated application of the
branching rule generates a tree structure as in figure 2.1. The
branching rule for a branch and bound algorithm employing a
relaxation procedure is slightly different from branching rules
for ordinary branch and bound algorithms. Given a set S, in the
latter case $\bigcup B(S) = S$, and in the former case
$\bigcup B(S) \subset S$ since we preclude some of the relaxed
feasible objects. The branching rule of course does not split a
singleton set since the least cost object (namely the only object)
in the set can be easily extracted. It will be useful to define
the function parent as follows: if $S_1 \in B(S)$ for some
$S \subseteq FS$ then parent($S_1$) = S.
Figure 2.1. Application of the branching rule to FS.

    $B(FS) = \{S_1, S_2, S_3\}$

    where $S_1 \subset FS$, $S_2 \subset FS$, $S_3 \subset FS$,
    and $S_1 \cup S_2 \cup S_3 \subseteq FS$.


The second component of a branch and bound algorithm is a lower
bound function LB which maps subsets of FS into nonnegative
integers. Intuitively the lower bound function computes a lower
bound on the cost of all objects in a given subset of FS. Formally
LB must satisfy the following conditions:

1.  for $S \subseteq FS$ and $s \in S$, $LB(S) \le c(s)$
    (LB computes a lower bound on the cost of objects in S),

2.  for $S_i \subseteq S_j \subseteq FS$, $LB(S_j) \le LB(S_i)$
    (the lower bound values are monotonically nondecreasing on any
    path from the root in the tree),

3.  if B(S) = S (B does not split S) then $LB(S) = c(s^*)$, where
    $s^*$ is the least cost object in S
    (when the least cost object in a set can be extracted, the
    cost of that object is the lower bound value of the set).

The lower bound function is used to eliminate from consideration
those subsets of FS which can be shown not to contain the least
cost solution. If it is known that a least cost object has a cost
of at most $c_1$, then any subset S for which $LB(S) > c_1$ cannot
yield the optimal solution.
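
The three conditions on LB and the elimination test can be checked mechanically on a small explicit tree. The sketch below is illustrative only: subsets are Python frozensets, an unsplit subset is signalled by an empty list of sons rather than by B(S) = S, LB is taken to be the minimum cost in the subset (the tightest possible bound), and all names and cost values are hypothetical. Condition 2 is checked only between a subset and its sons, which covers every path from the root by transitivity.

```python
costs = {"a": 7, "b": 4, "c": 9, "d": 6}
FS = frozenset(costs)

def cost(s):
    return costs[s]

def LB(S):
    """Tightest lower bound: the least cost of any object in S."""
    return min(costs[s] for s in S)

def B(S):
    """Branching rule: split FS once; smaller subsets are not split."""
    if S == FS:
        return [frozenset("ab"), frozenset("cd")]
    return []

def satisfies_lb_conditions(root, B, LB, cost):
    stack = [root]
    while stack:
        S = stack.pop()
        if any(LB(S) > cost(s) for s in S):                  # condition 1
            return False
        sons = B(S)
        if not sons and LB(S) != min(cost(s) for s in S):    # condition 3
            return False
        for son in sons:
            if LB(S) > LB(son):                              # condition 2
                return False
            stack.append(son)
    return True

ok = satisfies_lb_conditions(FS, B, LB, cost)

# Elimination: with a known solution of cost c1, drop subsets with LB > c1.
c1 = cost("b")
prunable = [S for S in B(FS) if LB(S) > c1]
```

With the known solution b of cost 4, the subset {c, d} has lower bound 6 and is eliminated without examining its objects.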
The third component of a branch and bound algorithm is a search
strategy which is a rule for choosing to which of the currently
active subsets of FS the branching rule should be applied. For
conceptual simplicity and uniformity of notation, a search
strategy will be realized here by a heuristic function
$h: 2^{FS} \to PRIORITY$ where PRIORITY is a set which depends on
the particular search strategy. Of those subsets waiting to be
explored via the branching rule we choose that subset S with the
smallest heuristic value h(S). At any particular time during a
branch and bound search, a certain set of subsets are waiting to
have the branching rule applied to them. If the heuristic values
of these subsets (computed by h) are distinct for all such times
during a search of any problem in any class then the heuristic
function is called unambiguous. Some common search strategies will
be discussed below along with unambiguous heuristic functions
which realize them.

The branch and bound algorithm for finding a single least cost
object is given below in an ad-hoc ALGOL-like language. The
principal data structure employed is a priority queue. A priority
queue as used here is a data structure which stores data objects
(in this case nodes representing subsets of FS) with an associated
priority given by the heuristic function h. The queue is
accessible only by the functions NONEMPTY, which returns true if
and only if the queue is nonempty, REMOVETOP, which removes and
returns the data object in the queue of highest priority (priority
i is higher than priority j if and only if $i \le_h j$ for a
suitable definition of the relation $\le_h$), and INSERT, which
inserts a data object into the queue with its associated priority.
Efficient algorithms for manipulating priority queues in this
manner are discussed in [Aho, Hopcroft, and Ullman 1974]. The
procedure BB in figure 2.2 is typically invoked with the node
representing FS and $\infty$ as arguments. In later discussion of
BB we will use the node symbol $N_0$ to represent FS. An obvious
improvement of BB is to check that cost($N_i$) < bound in
statement 10 before the node $N_i$ is inserted in the queue. While
such a test will improve the performance of BB somewhat in
practice, we omit it here for the sake of simplifying our analysis
of the behavior of BB. Its inclusion would not affect our order of
magnitude results on the time complexity of branch and bound
search but would have the effect of lowering the space complexity
somewhat. Several other enhancements of the pruning power of BB
may be added to this code but they are not always easy to discover
for a particular problem. A dominance relation [Kohler and
Steiglitz 1974; Ibaraki 1977, 1978] is a relation on subsets of FS
such that if $S_1$ dominates $S_2$ then $S_2$ cannot contain a
better solution than $S_1$, so $S_2$ can be eliminated. This test
is a direct generalization of the lower bound test. If it can be
determined that two subsets $S_1, S_2 \subseteq FS$ are equivalent
in the sense that the optimal solution in one is as good as the
optimal solution in the other, then only one of these subsets
needs to be explored. This test is called an equivalence test
[Ibaraki 1977, 1978].
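
For readers who prefer an executable form, the procedure BB of figure 2.2 can be sketched in Python roughly as follows. This is a hedged translation, not the author's code: the priority queue is realized with the standard `heapq` module, a counter breaks ties between equal priorities, the function returns the final bound along with the solution, and the names `branch_and_bound`, `sons`, `tree`, `lb` and `depth` are hypothetical. The depth-first key shown at the end is likewise an assumption about how a tuple-valued h could be encoded for a min-heap.

```python
import heapq
import math
from itertools import count

def branch_and_bound(root, sons, cost, h, bound=math.inf):
    """Sketch of BB. `sons(N)` applies the branching rule (empty list when
    N is not split), `cost(N)` is the lower bound value of N, and `h(N)` is
    the heuristic priority (smaller values are examined first)."""
    tie = count()                        # tie-breaker; nodes are never compared
    pq = [(h(root), next(tie), root)]    # statement 2: insert root on queue
    solution = None
    while pq:                            # statement 3
        _, _, node = heapq.heappop(pq)   # statement 4
        if cost(node) < bound:           # statement 5
            children = sons(node)        # statement 6
            if not children:             # statement 7: better solution found
                bound = cost(node)       # statement 8
                solution = node          # statement 9
            else:                        # statement 10: store sons
                for child in children:
                    heapq.heappush(pq, (h(child), next(tie), child))
    return solution, bound

# Hypothetical search tree with lower bound values on its nodes.
tree = {"N0": ["N1", "N2"], "N1": ["N3", "N4"], "N2": ["N5", "N6"]}
lb = {"N0": 1, "N1": 3, "N2": 2, "N3": 6, "N4": 4, "N5": 5, "N6": 7}
depth = {"N0": 0, "N1": 1, "N2": 1, "N3": 2, "N4": 2, "N5": 2, "N6": 2}

def sons(N):
    return tree.get(N, [])

def cost(N):
    return lb[N]

# Smallest lower bound first.
best_first = branch_and_bound("N0", sons, cost, h=cost)
# Deepest node first, ties broken by lower bound.
depth_first = branch_and_bound("N0", sons, cost,
                               h=lambda N: (-depth[N], lb[N]))
```

Both runs return the node N4 with cost 4; they differ only in the order in which nodes are removed from the queue, which is exactly the role played by h.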

Figure 2.2. A branch and bound procedure.

1.  node procedure BB(node N, integer bound);
    begin priority queue PQ;
      integer i, k;
      node solution;
      integer function REMOVETOP(priority queue),
        INSERT(priority queue, node, PRIORITY), cost(node);
      PRIORITY function h(node);
      boolean function NONEMPTY(priority queue);
2.    INSERT(PQ, N, h(N));   /* insert root on queue */
3.    while NONEMPTY(PQ) do
4.    begin N := REMOVETOP(PQ);
5.      if cost(N) < bound
        then begin
6.        apply the branching rule to N, i.e.
          determine the sons N_1, N_2, ..., N_k of N;
7.        if k = 0
          then begin   /* better solution found */
8.            bound := cost(N);
9.            solution := N;
          end
          else   /* store sons for later examination */
10.         for i := 1 step 1 until k do
              INSERT(PQ, N_i, h(N_i));
        end
    end;
11.   BB := solution
    end;


One focus of this dissertation is on several common search
strategies and the effect they have on the average case
performance  of  a  branch  and  bound  algorithm.  The 
best-bound-f i  rst  (bbf)  search  strategy  [Lawler  and  Wood  1965; 
Fox,  Lenstra,  Rinnooy  Kan,  and  Schrage  1978]  chooses  to  apply 
the  branching  rule  to  that  subset  with  the  smallest  lower 
bound.  This  strategy  is  realized  by  the  heuristic  function 

h(S)  =  LB(S)  (1) 

where LB is the lower bound function used in a branch and bound
algorithm. The relation $\le_h$ is just the usual relation $\le$ on the
reals. In practice a priority queue is indeed the appropriate
data structure for implementing a best-bound-first search. The
ordered-depth-first (odf) search strategy applies the branching
rule to the least cost of the most recently split subsets and
may be realized by

h(S)  =  (d(S),LB(S))  (2a) 

where d(S) = depth of the subset S in the tree generated by BB
and the range of h is the set of ordered pairs. This heuristic
function makes the priority queue simulate a stack each element
of which is a priority queue, which is the way one would
implement this search strategy in practice. The
generation-order-depth-first (godf) search strategy applies the
branching rule to the first generated of the subsets of a split
set and can be realized by

h(S)  =  (d(S),i)  (2b) 

for the i-th generated set. Again h produces an ordered pair.
This heuristic function makes the priority queue simulate a
stack whose elements are queues (or stacks; it does not
matter). For both of these heuristic functions we define
$(a,b) \le_h (c,d)$ if and only if a>c or (a=c and b<d), i.e. subset S
with h(S) = (a,b) has higher priority than subset T with h(T) =
(c,d) if and only if either S is deeper in the tree or S and T
have the same depth but the second component of h(S) (the lower
bound in (2a), the generation number in (2b)) is less than that
of h(T). The ordered-breadth-first (obf) search strategy
chooses to examine the least cost of the subsets which has
the smallest depth in the tree generated by BB. A particular
heuristic function realization is

h(S)  =  (d(S) ,LB(S))  (3a) 

as for depth-first search. In practice an ordered-breadth-first
search is implemented using a separate priority queue for
each level of the search tree so that the nodes on a given level
can be extracted in order of increasing cost. Here we define
$(a,b) \le_h (c,d)$ if and only if a<c or (a=c and b<d). The
generation-order-breadth-first (gobf) search strategy examines
the subsets a level at a time in the order of their generation,
and may be realized by


h(S) = (d(S),i)  (3b)

for the i-th generated node on level d(S). In practice a
generation-order-breadth-first search may be implemented using
a single queue for storing nodes of the search tree. Note that
according to the realizations given, both ordered-depth-first
search and ordered-breadth-first search have local best-bound
search components. E.g. in a breadth-first search a best-bound
search is performed on the set of nodes that appear at a given
depth. Figure 2.3 gives an example of a tree and the order in
which each of the above search strategies examines it.
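The five realizations above can be tried out with a short Python sketch of the procedure in Figure 2.2. Several things here are assumptions rather than part of the original text: the tree `SONS` was inferred from the five examination orders listed with Figure 2.3 (the drawing itself is not reproduced here), each node is labelled by its cost, nodes 21 and 30 are treated as leaves, and a single global generation counter stands in for the index i of (2b) and (3b). Depth is negated in the depth-first keys because Python's `heapq` removes the smallest key first.

```python
import heapq
import math
from itertools import count

# Hypothetical tree: node label = cost (lower bound); sons in generation order.
SONS = {0: [20, 15, 10, 30], 20: [25, 24], 10: [19, 21]}

# Priority keys; the smallest key is removed from the queue first.
STRATEGIES = {
    "bbf":  lambda depth, lb, gen: (lb,),          # (1)
    "odf":  lambda depth, lb, gen: (-depth, lb),   # (2a)
    "godf": lambda depth, lb, gen: (-depth, gen),  # (2b)
    "obf":  lambda depth, lb, gen: (depth, lb),    # (3a)
    "gobf": lambda depth, lb, gen: (depth, gen),   # (3b)
}

def branch_and_bound(root, h, bound=math.inf):
    """Return (best leaf found, order in which nodes leave the queue)."""
    gen = count()  # global generation counter (an assumption of this sketch)
    queue = [(h(0, root, next(gen)), root, 0)]
    solution, examined = None, []
    while queue:
        _, node, depth = heapq.heappop(queue)
        examined.append(node)
        if node < bound:                 # statement 5: cost(N) < bound
            sons = SONS.get(node, [])
            if not sons:                 # k = 0: better solution found
                bound, solution = node, node
            else:                        # store sons for later examination
                for son in sons:
                    key = h(depth + 1, son, next(gen))
                    heapq.heappush(queue, (key, son, depth + 1))
    return solution, examined

for name, h in STRATEGIES.items():
    print(name, branch_and_bound(0, h))
```

On this assumed tree every strategy returns the leaf of cost 15, and the examination orders agree with the sequences listed with Figure 2.3.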


Figure 2.3. A search tree and the order in which the search
strategies realized by (1), (2a), (2b), (3a), and (3b) examine the
nodes. A leaf is denoted by putting a star under a node. Lines
under a node mean that if necessary the node could be split
further.

bbf  - <0, 10, 15, 19, 20, 21, 30>
godf - <0, 20, 25, 24, 15, 10, 19, 21, 30>
odf  - <0, 10, 19, 21, 15, 20, 30>
gobf - <0, 20, 15, 10, 30, 25, 24, 19, 21>
obf  - <0, 10, 15, 20, 30, 19, 21>

As an example of a branch and bound algorithm we will
consider a subtour-elimination algorithm for solving traveling
salesman problems. Subtour-elimination algorithms make use of
a relaxation of the traveling salesman problem called the
assignment problem (AP) which can be easily solved. The
assignment problem comes from the problem of assigning n men to n
jobs in a way which minimizes the cost of the assignment. We
are given an n x n matrix $[c_{ij}]$ where $c_{ij}$ is the cost of
assigning man i to job j. The cost of an assignment is the sum of
the costs of assigning each man to his job. The feasible set
of an instance of the assignment problem can be formulated as the
set of permutations of n objects. For example given the cost matrix

    man \ job   1   2   3
        1       1   3   7
        2       4   2   5
        3       3   2   6


the optimal assignment of men to jobs is: man 1 to job 1, man 2
to job 3, and man 3 to job 2, or (1)(2 3) in cycle notation.
On the same cost matrix the optimal traveling salesman tour is
(1 2 3). The assignment problem is a relaxation of the traveling
salesman problem since a solution to the former is a permutation
(composed of one or more cycles) whereas a solution to
the latter must be a cyclic permutation (a permutation with
just one cycle). The assignment problem is solvable in $O(n^3)$
time for an initial problem and $O(n^2)$ for subsequent modified
versions of the initial problem [Bellmore and Malone 1971;
Lawler 1976]. Subtour-elimination algorithms differ mainly in
their choice of branching rule. The following branching rule
was proposed by Shapiro [Shapiro 1966]:

Given cost matrix C, solve the assignment problem with
respect to C. If the least cost solution, $\pi$, is cyclic, then we
have extracted the least cost cyclic permutation over the
feasible set of C, so there is no need to branch. If $\pi$ is
noncyclic then pick one of its subcycles, say the smallest, and
let this cycle be denoted $(i_1, i_2, \ldots, i_k)$. In the optimal cost
cyclic permutation, at least one of the nodes in this cycle
must be directed outside the cycle since the subcycle cannot be
a part of a cyclic permutation. The feasible set is split as
follows: In the j-th subset we force the node $i_j$ to connect to a
node not in the cycle $(i_1, i_2, \ldots, i_k)$ by setting the matrix
entries

$$c_{i_j,i_1} = c_{i_j,i_2} = \cdots = c_{i_j,i_k} = \infty.$$

The lower bound function is simply the cost of the
assignment problem solution. It is easily shown that this is a
lower bound function. Condition 1 for a lower bound function
(the lower bound function yields a lower bound on the cost of
all objects in a given set) is satisfied since the assignment
solution is by definition a lower bound on the cost of all
permutations feasible with respect to the cost matrix. Condition 2
(if $S_1 \subseteq S_2$ then the lower bound on $S_2$ is $\le$ the lower bound on
$S_1$) is satisfied because the least cost permutation in a set S
will have at least as small a cost as the least cost permutation
in any subset of S. Condition 3 (if B(S)={S} then
$LB(S)=c(s^*)$ where $s^*$ is the least cost object in S) is satisfied
since a set S is not split if the assignment solution is
also a traveling salesman solution. In this case the lower
bound is just the cost of the least cost feasible object (a
cyclic permutation) in S. There are a number of variations on
the branching rule given in [Bellmore and Nemhauser 1971;
Garfinkel 1973; Smith, Srinivasan, and Thompson 1977].

In  all  published  versions  the  smallest  subcycle  of  the 
assignment  solution  is  chosen  to  guide  the  set  splitting.  This 
is  justifiable  on  the  general  principle  of  tree  searching  that 
if  possible  it  is  wise  to  arrange  the  tree  such  that  smaller 
branching  factors  are  near  the  top  of  the  tree  and  larger 
branching  factors  are  deeper  in  the  tree.  The  reason  behind 
this  principle  is  that  if  a  node  is  pruned  near  the  top  of  such 
a  tree,  there  is  a  relatively  larger  reduction  in  the  size  of 
the  feasible  space  due  to  the  fact  that  the  feasible  space  has 
been  split  fewer  times  by  the  smaller  branching  factor 
[Reingold, Nievergelt and Deo 1977, pages 111-112]. The choice
of the smallest subcycle is good from another point of view.
Bellmore and Malone [Bellmore and Malone 1971] have shown that
this choice maximizes the reduction in the number of feasible
noncyclic solutions.

The subtour-elimination approach first appeared in [Eastman
1957] and was subsequently developed by [Shapiro 1966;
Bellmore and Malone 1971; Smith, Srinivasan, and Thompson
1977].

Some  Properties  of  Branch  and  Bound  Algorithms 

Efforts  have  been  made  to  devise  a  formalism  general 
enough  to  cover  the  diverse  applications  of  the  branch  and 
bound  procedure  [Lawler  and  Wood  1966;  Mitten  1970;  Rinnooy  Kan 
1974,  1976;  Kohler  and  Steiglitz  1974;  Ibaraki  1976,  1978]. 
These  formalisms  have  been  used  to  prove  correctness  and  termi- 
nation properties  and  also  to  investigate  theoretically  the  ef- 
fects of  various  choices  of  parameters  on  performance. 
Although  the  theorems  in  this  section  are  not  essentially  new 
the  proofs  are  new  in  order  to  cover  our  different  definitions 
and  assumptions. 

The  first  proposition  allows  us  to  assume  that  the 
heuristic  realizations  of  the  search  strategies  considered 
above  are  unambiguous. 

Proposition  2.1:  The  best-bound-first,  depth-first  (both  or- 
dered and  generation-order),  and  breadth-first  (both  ordered 
and  generation-order)  search  strategies  can  be  realized  by 
unambiguous  heuristic  functions. 

Proof: We can show that (1), (2a), and (3a) are unambiguous
if the lower bound function LB satisfies the following
condition: for any $S, T \subseteq FS$, if $S \ne T$ then $LB(S) \ne LB(T)$. If for a
particular problem and branch and bound algorithm LB does not
satisfy this condition then a new lower bound function can be
constructed at runtime as follows: LB'(S) = (LB(S),k) where S
is the k-th distinct subset of FS with cost LB(S) examined to
the current point in the search. With regard to the relation
< used in line 5 of figure 2.2, define (a,b)<(c,d) if and only
if a<c or (a=c and b<d). Clearly LB' generates distinct bounds
on distinct subsets. The heuristic function for best-bound-first
search (1) h(S)=LB'(S) is unambiguous since all values of
h for distinct subsets are distinct. The same reasoning holds
for (2a) and (3a) since the range of h incorporates the lower
bound LB'.

Let us now consider generation-order-depth-first search.
Suppose there is a tree such that at some time during a
generation-order-depth-first search there are distinct subsets
$S_1$ and $S_2$ in memory with the same heuristic value $h(S_1) =
h(S_2)$. Thus $d(S_1) = d(S_2)$ and $i_{S_1} = i_{S_2}$, where d(S) = depth of
S and $i_S$ is the generation number of S. $S_1$ and $S_2$ clearly cannot
have the same parent set, otherwise they would not have the
same generation number. Suppose now without loss of generality
that $S_1$ was generated prior to $S_2$. Let $P_2$ denote the parent of
$S_2$. We have $d(P_2) = d(S_2)-1$. Since $S_1$ is generated prior to
$S_2$ and there is a time when both $S_1$ and $S_2$ are in memory
simultaneously, $S_1$ must be on the queue when $P_2$ is split to form $S_2$
and its sibling sets. But $P_2$ cannot be chosen over $S_1$ for
branching because $d(P_2)<d(S_1)$ and therefore $h(P_2)>h(S_1)$. This
contradiction shows that there cannot be two distinct sets on
the queue with the same heuristic value under
generation-order-depth-first search. A gobf search is unambiguous because no two
sets on a given level can be assigned the same generation
number, so all assigned priorities are distinct. QED


The following sequence of definitions together with
proposition 2.1 and lemma 2.1 lead up to a theorem regarding the
conditions under which BB will find an optimal solution to an
instance of a combinatorial minimization problem. Let A denote
the sequence of nodes examined by $BB'(N_0,\infty)$ where BB' is BB
with the test of statement 5 replaced by the value TRUE. Thus
A is the order in which BB examines the entire tree if no
cutoffs are performed. The possibility that the tree is infinite
shall present no difficulties in defining A because we are only
interested in the relative ordering of the nodes. In a similar
way define $A^b$ to be the sequence of nodes examined by $BB(N_0,b)$,
i.e. BB given an initial bound of b. Note that A is not
necessarily the same as $A^\infty$ since cutoffs are made in the latter
sequence. Lastly define $B^b$ to be a sequence of values (bounds)
associated with the nodes of A as follows: for each $i \ge 0$, if
node $A_i$ is in $A^b$ (i.e. $A_i$ is examined by $BB(N_0,b)$), then $B_i^b$ is
the value of the variable bound in statement 5 at the time that
$A_i$ is examined by $BB(N_0,b)$. Otherwise $B_i^b = B_{i-1}^b$. Note that
$B_0^b = b$. As an example of these definitions consider the action of
BB under a generation-order-depth-first search strategy on the
tree of figure 2.3, we find

..., 15, 30, 35>
15, 30>
..., 19, 15, 15>
15, 30>
..., 19, 15, 15>


Proposition 2.2: For all b, $A^b$ is a subsequence of A.

Proof: Since $A^b$ contains a subset of A it remains to be
shown that the order of nodes in A is preserved in $A^b$. Assume
the contrary. Let s be the first element of $A^b$ which is out of
order with respect to the ordering of A; so A = <...,s,...,r,...> and
$A^b$ = <...,r,s,...>. There are two cases to consider. Case 1: At
the time that s is examined in A, r is in the queue. By our
assumption that the heuristic functions are unambiguous
h(s)<h(r). But in $A^b$ if r and s are in the queue when r is
examined then h(r)<h(s). Or if s is not in the queue when r is
examined it is because r is the parent of s. Thus r must be
examined before s in A. Either way we have a contradiction.
Case 2: At the time that s is examined in A, an ancestor r' of r
is in the queue (see figure 2.4). r' must be examined sometime
between the time that s is examined and the time that r is
examined. Again by the restriction on the heuristic functions
h(s)<h(r'). Since h(s)<h(r') it must be that s is not in the
queue when r' is examined in $A^b$. Thus an ancestor, s', of s must
be in the queue with r' and h(r')<h(s'). This implies that s'
follows r' in $A^b$. On the other hand, s' must precede s and since
h(s)<h(r') r' must follow s in A, thus r' follows s' in A. But
this means that r' is out of order contradicting our assumption
that s is the first node in $A^b$ out of order. QED

Figure 2.4. A and $A^b$ in case 2 of proposition 2.2. The inferred
state of the priority queue is given under s in A and r' in $A^b$.
The topmost entries have the highest priority.

The following lemma asserts that the nodes examined in a
branch and bound search with finite initial bound are exactly
those whose parents have a cost within the current value (at
examination time of the parent) of the bound.

Lemma 2.1: For any node M in A, except the initial node $N_0$, and
for any finite integer b, M is examined by $BB(N_0,b)$ if and only
if $LB(M') < B^b_{index(M')}$, where M' = parent(M) and index(N) is the
index in A of node N. LB is the lower bound function.

Proof: If part: We will show by contradiction that all nodes in
A except $N_0$ satisfy this half of the lemma. Let M be the first
node in A which is not examined by $BB(N_0,b)$, and whose parent,
M', has cost $cost(M') < B^b_{index(M')}$. We can show that M' is
examined by $BB(N_0,b)$ as follows: if $M' = N_0$ then we already know that
M' is examined since the initial node is always examined by BB.
Otherwise let M'' = parent(M'). We can show that
$LB(M'') < B^b_{index(M'')}$. We have $LB(M'') \le LB(M')$ by condition 2 of the
definition of a lower bound function, and $B^b_{index(M'')} \ge B^b_{index(M')}$
since M'' comes before M' in the node ordering of A and the
sequence $B^b$ is nonincreasing. Thus we have
$LB(M'') \le LB(M') < B^b_{index(M')} \le B^b_{index(M'')}$. Since we have assumed
that M is the first unexamined node in A such that
$LB(M') < B^b_{index(M')}$ it must be that M' is examined by $BB(N_0,b)$.
When M' is examined we have $LB(M') < B^b_{index(M')}$, i.e., the lower
bound of M' is within the current value of the program variable
bound, thus the test of line 5 in Figure 2.2 evaluates to true
and the sons of M' (including M) are inserted in the priority
queue for later examination. For any finite initial bound b,
there are only a finite number of nodes with a lower bound less
than b and since any node can insert a finite number of sons in
the queue, there are only a finite number of nodes inserted in
the queue during the execution of $BB(N_0,b)$. This means that
after a finite time the node M must be examined. This statement
contradicts our assumption that M is not examined by BB,
so our assumption that there is a node that is not examined
under the condition of the lemma must be false.

For the only if part: if M is examined by $BB(N_0,b)$ then
by definition M was inserted in the priority queue at some
time. This could only be if, when the parent of M, M', was
examined, the test of line 5 in Figure 2.2 evaluated to true,
i.e., $LB(M') < bound = B^b_{index(M')}$. QED


It can now be shown under what conditions BB will solve
an instance of a combinatorial minimization problem.


Theorem 2.1: For finite $b > c^*$, where $c^*$ is the cost of a least
cost object in FS, $BB(N_0,b)$ will find an optimal solution of
cost $c^*$ in a finite amount of time.

Proof: If $c^*$ is the cost of a least cost object in FS then for
all nodes in A, $B_i^b \ge c^*$ where i ranges over the nodes of A (we
have $B_0^b = b > c^*$, and there is no object in FS which could lower the
bound below $c^*$). Let $s^*$ be the first node in A which
represents a subset of FS from which the branching rule
extracts a least cost object of cost $c^*$. The parent of $s^*$ clearly
has a lower bound $\le c^*$, which is less than the bound in force
when the parent is examined (no object of cost $c^*$ has been
extracted before $s^*$), so by lemma 2.1, $s^*$ is examined by
$BB(N_0,b)$. Thus $BB(N_0,b)$ finds an optimal solution. Again
since only a finite number of nodes are involved when b is finite
BB terminates in a finite amount of time. QED


We conclude this chapter with the following theorem which
asserts that it is worthwhile to find as tight an initial bound
as possible on the cost of the optimal solution in order to
minimize the amount of work necessary to find the optimal solution.

Theorem 2.2: For a particular instance of a combinatorial
minimization problem p=(FS,cost) where $cost \in COST_n$ in class $C_n$,
let ET(h,p,b) be the number of subsets examined by $BB(N_0,b)$
using the search strategy realized by h. For all h and all b, b'
such that b'>b>0, $ET(h,p,b) \le ET(h,p,b')$.

Proof: The theorem follows easily from the following lemma.

Lemma 2.2: For all i and b'>b, $B_i^b \le B_i^{b'}$.

Proof by induction on i. For i=0, $B_0^b = b$ and $B_0^{b'} = b'$ so the
lemma holds. Next assume $B_{i-1}^b \le B_{i-1}^{b'}$ for some $i \ge 1$. $B_i^b$ will be
changed from the value $B_{i-1}^b$ only if $BB(N_0,b)$ examines $A_{i-1}$ and
it represents a solution which changes the bound. If $B_i^b > B_i^{b'}$
it must be because $B_i^{b'} = cost(A_{i-1})$ but $B_i^b$ remains unchanged.
Let $A_j$ be the parent of $A_{i-1}$. $A_{i-1}$ is unexamined in $BB(N_0,b)$
only if $cost(A_j) \ge B_j^b$ by lemma 2.1, whereas $A_{i-1}$ is examined by
$BB(N_0,b')$ so $cost(A_j) < B_j^{b'}$. Since the sequence $B^b$ is
nonincreasing and the cost of a node is no less than the cost of its
parent, $B_i^b \le B_j^b \le cost(A_j) \le cost(A_{i-1}) = B_i^{b'}$. (If $A_{i-1}$ is
examined by $BB(N_0,b)$ but leaves the bound unchanged, then
$cost(A_{i-1}) \ge B_{i-1}^b = B_i^b$ directly.) Either way this contradicts
$B_i^b > B_i^{b'}$. So $B_i^b \le B_i^{b'}$.

Let $A_i$ be a subset examined by $BB(N_0,b)$. Let $A_j$ denote
the parent of $A_i$. By lemma 2.1, $cost(A_j) < B_j^b$ and $B_j^b \le B_j^{b'}$ by
lemma 2.2, so $cost(A_j) < B_j^{b'}$. Thus $A_i$ is also examined by $BB(N_0,b')$
according to lemma 2.1. If every subset examined by $BB(N_0,b)$
is also examined by $BB(N_0,b')$ then $ET(h,p,b) \le ET(h,p,b')$. QED
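Theorem 2.2 can be checked experimentally on a small example. The Python sketch below counts the nodes examined by the procedure of Figure 2.2 for increasing initial bounds under two strategies. The tree `SONS` (node label = cost, sons in generation order), the function name `examined` and the particular bounds are all assumptions made for illustration, not part of the original text.

```python
import heapq
import math

# Hypothetical tree: node label = cost; sons listed in generation order.
SONS = {0: [20, 15, 10, 30], 20: [25, 24], 10: [19, 21]}

def examined(b, depth_first):
    """Nodes examined by BB(N_0, b) under godf (True) or bbf (False)."""
    gen = 0
    queue = [((0, 0), 0, 0)]             # (priority key, node, depth)
    bound, order = b, []
    while queue:
        _, node, depth = heapq.heappop(queue)
        order.append(node)
        if node < bound:                 # statement 5 of Figure 2.2
            sons = SONS.get(node, [])
            if not sons:
                bound = node             # better solution found
            for son in sons:
                gen += 1
                key = (-(depth + 1), gen) if depth_first else (son, gen)
                heapq.heappush(queue, (key, son, depth + 1))
    return order

bounds = [5, 12, 16, 21, math.inf]
for depth_first in (True, False):
    sizes = [len(examined(b, depth_first)) for b in bounds]
    print("godf" if depth_first else "bbf", sizes)
```

For both strategies the printed counts never decrease as the initial bound grows, in line with the theorem, and the order obtained for a finite bound is a subsequence of the order obtained with no initial bound, in line with proposition 2.2.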




Chapter  3. 

Computational  Complexity 

and  a  Model  of  Branch  and  Bound  Search  Trees 

3.1  Random  Problems  and  Random  Trees 

In  the  previous  section  it  was  noted  that  the  branch  and 
bound  process  generates  a  tree  structure.  In  this  chapter  we 
use  this  abstraction  to  define  a  probabilistic  class  of  trees 
which  models  the  kind  of  tree  structures  that  BB  generates  over 
the  instances  of  a  combinatorial  minimization  problem.  Within 
this  model  then  it  makes  sense  to  derive  expressions  for  the 
expected time and space requirements of BB under various search
strategies.  The  set  of  subsets  of  FS  that  are  inserted  in  the 
priority  queue  during  the  execution  of  BB  is  called  the  search 
tree  and  the  time  complexity  of  a  branch  and  bound  search  will 
be  measured  by  the  size  of  the  search  tree.  The  space 
complexity  will  be  measured  by  the  maximum  number  of  subsets  in 
the  queue  at  any  time  during  the  search.  The  time  and  space 
complexities  of  a  given  search  by  BB  will  sometimes  be  denoted 
by  the  variables  N  and  Ns  respectively.  This  definition  of 
time  complexity  does  not  include  the  amount  of  time  spent  exe- 
cuting the  branching  rule  or  inserting  nodes  in  the  queue.  A 
branching rule is a feature of a particular algorithm and little
can be said about it on the level of abstraction aimed for
in this dissertation. We assume that these factors are relatively
independent of the rest of the branch and bound process
so  that  the  product  of  the  average  branching  time  per  subset, 
the  average  time  spent  inserting  a  node  in  the  queue,  and  the 
total  number  of  nodes  inserted  in  the  queue  (the  time  complexi- 
ty) is  a  reasonable  approximation  to  the  running  time  of  a  par- 
ticular problem  on  a  machine.  Again  though  our  goal  is  to  com- 
pare the  expected  performance  of  various  search  strategies  on  a 
problem.  On  a  given  problem  the  branching  time  and  node  inser- 
tion time  should  factor  out  of  this  comparison  leaving  the  size 
of  the  search  tree  as  the  essential  measure  of  performance. 

The question of interest is how can we model the behavior
of BB on a random instance of a problem apart from the details
of the problem. I.e., what features of a branch and bound
search are intrinsic to branch and bound and what are problem
dependent? First, by the action of the branching rule a tree
structure is generated, so BB is a tree searching algorithm.
Secondly, the lower bound function of BB associates a number
with each node in this tree. The search strategy does not
affect the tree per se, but only the order in which the algorithm
examines the tree. So a tree with costs associated with each
node is another way of expressing the domain of BB. In this
setting the goal of BB is to find the least cost leaf of the
tree. These considerations are formalized in the following
definition. An arc-labelled tree is a tree T=(N,A,C) where N is
a set of nodes, A is a set of arcs, and $C: A \to Z^+$ (positive
integers) is a cost function on the arcs of the tree. For example
see figure 3.1. In an arc-labelled tree the cost of a node
is defined to be the sum of the costs on the arcs on the path
from the root to the node. The cost of the root is zero.

Figure 3.1. An arc-labelled tree

The  next  step  is  to  map  the  notion  of  a  random  instance 
of  given  size  into  the  arc-labelled  tree  domain.  A  probability 
function  is  assigned  to  the  class  of  arc-labelled  trees  which 
should  somehow  correspond  with  a  probability  distribution  on  a 
combinatorial  minimization  problem.  Our  model  of  this  mapping 
is  to  regard  the  generation  of  a  tree  as  a  random  process  in 
which  each  application  of  the  branching  rule  is  replaced  by  an 
independent  random  experiment  where  the  outcome  is  the  number 
of  sons  that  a  node  has.  In  a  similar  manner  the  assignment  of 
a  cost  to  a  node  is  treated  as  the  outcome  of  a  different  in- 
dependent random  experiment.  Formally  let  P  and  Q  be  probabil- 
ity mass  functions.  It  is  assumed  that  P  and  Q  satisfy  the 
following  properties: 

1.  P(0)  >  0    (a  node  is  terminal  with  nonzero  probability) 

2.  Q(0)  =  0    (an  arc  has  cost  zero  with  probability  zero). 

The algorithm in figure 3.2 generates a random arc-labelled
tree. Let RANDOM(F) be a random function which returns k with
probability F(k). This dynamic means of defining a random
arc-labelled tree is easily implemented for experimental
purposes on a machine.
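As a rough illustration of how such an experiment might be set up, the following Python sketch implements RANDOM(F) and the sprouting process. The dictionary representation of P and Q, the first-in-first-out sprouting order, the tuple layout of a node and the `max_nodes` cutoff (needed because the generated tree may be infinite) are all assumptions of this sketch.

```python
import random
from collections import deque

def random_from(pmf, rng):
    """RANDOM(F): return k with probability pmf[k] (pmf maps k to F(k))."""
    values = list(pmf)
    return rng.choices(values, weights=[pmf[k] for k in values])[0]

def random_tree(P, Q, rng, max_nodes=1000):
    """Generate a random arc-labelled tree, sprouting nodes in FIFO order.

    Node i is the tuple (parent index or None, arc cost, node cost).
    No further node is sprouted once max_nodes nodes exist.
    """
    nodes = [(None, 0, 0)]               # the root has cost zero
    unsprouted = deque([0])
    while unsprouted and len(nodes) < max_nodes:
        n = unsprouted.popleft()
        for _ in range(random_from(P, rng)):
            arc = random_from(Q, rng)
            nodes.append((n, arc, nodes[n][2] + arc))
            unsprouted.append(len(nodes) - 1)
    return nodes

# Hypothetical mass functions with P(0) > 0 and Q(0) = 0.
P = {0: 0.5, 2: 0.5}
Q = {1: 0.5, 2: 0.5}
tree = random_tree(P, Q, random.Random(0), max_nodes=50)
print(len(tree), tree[:3])
```

Passing an explicit `random.Random` instance keeps an experiment reproducible, and the order in which unsprouted nodes are selected does not affect the distribution of the generated tree because every trial is independent.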

This  process  is  related  to  the  well-known  branching  pro- 
cess [Harris  1963]  which  has  applications  to  population  growth, 
nuclear  fission  reactions,  and  particle  cascades.  The  basic 
branching  process  is  essentially  the  same  as  the  process  in 
figure  3.2  except  that  sprouting  may  be  done  in  parallel  and 
there  is  no  arc-labelling.  The  initial  node  in  a  branching 
process  is  viewed  as  an  individual  who  gives  birth  to  k  indivi- 
duals with  probability  P(k),  who  in  turn  give  birth  to  new  in- 
dividuals, and  so  on.  The  number  of  nodes  at  depth  d  in  the 
generated  tree  is  the  random  variable  of  interest  and  it  is  in- 
terpreted as  the  size  of  the  population  at  time  d.  The  theory 
of  branching  processes  is  concerned  with  the  distribution  and 
moments of the population size as a function of time, the
probability of extinction (i.e., whether the tree is finite or
infinite), and the behavior of the process in the case that the
population does not die out. In contrast our concern here is
with  the  behavior  of  the  algorithm  BB  on  a  randomly  generated 
tree.  In  general  only  a  small  finite  portion  of  the  tree  will 
be  searched  by  BB. 

Figure 3.2. Generating a random arc-labelled tree.

1. Let a root node exist. The root is unsprouted.

2. Select an unsprouted node n (according to some search
strategy) and sprout it as follows:

Let n have RANDOM(P) sons. For each arc from n to its
sons label the arc with cost RANDOM(Q).

3. Repeat step 2 until all nodes have been sprouted.

We will need to define a probability function on the set
of arc-labelled trees. This can be accomplished as follows.
The generation of a tree is viewed as a sequence of trials,
where each execution of step 2 in figure 3.2 is a trial. Let n
denote the number of sons generated in a random trial and let
$c_1, c_2, \ldots, c_n$ denote the arc costs assigned to the arcs. The
probability of the outcome of a trial then is
$P(n)Q(c_1)Q(c_2) \cdots Q(c_n)$. Clearly if we sum over all possible
outcomes of a trial, the probabilities sum to 1,

$$\sum_{n=0}^{\infty} P(n) \sum_{c_1=1}^{\infty} Q(c_1) \cdots \sum_{c_n=1}^{\infty} Q(c_n) = 1.$$

We can formulate the probability of a tree generated by this
process as follows. Consider the probabilities of the outcomes
of the trials during the generation of a tree in a sequence
$\langle g_0, g_1, g_2, \ldots \rangle$, where $g_i$ is the
probability of the particular outcome of the i-th trial. Let us
call the product $g_0 g_1 \cdots g_i$ the i-th partial probability
of the randomly generated tree. The probability of a randomly
generated tree then is the limit as i goes to infinity of the i-th
partial probability. For example, the probability of the
arc-labelled tree in Figure 3.1 is
P(2)*Q(1)*Q(2)*P(0)*P(3)*Q(3)*Q(5)*Q(7)*P(0)*P(0)*P(0). It is our
special assumption that each trial is independent of all other
trials that enables us to take the product of the probabilities of
the individual trials as the probability of the tree. A more formal
approach to this probability function over the set of arc-labelled
trees can be based on the measure-theoretic treatment of trees
generated by a branching process found in [Mode 1971, pg. 3-6].
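The product of trial probabilities can be computed mechanically for any finite tree. The sketch below is an illustration only: the nested-list tree representation and the name `tree_probability` are assumptions, and the shape of `example` is one tree consistent with the product written out above (the figure itself is not reproduced here).

```python
from fractions import Fraction


def tree_probability(tree, P, Q):
    """Probability of a finite arc-labelled tree under independent trials.

    A tree is a list of (arc_cost, subtree) pairs, one per son; a leaf is [].
    """
    prob = P.get(len(tree), 0)               # the trial fixes the number of sons
    for arc_cost, subtree in tree:
        prob *= Q.get(arc_cost, 0)           # ... and the cost of each arc
        prob *= tree_probability(subtree, P, Q)
    return prob


# Assumed shape: the root has two sons (arc costs 1 and 2); the first is a
# leaf and the second has three leaf sons (arc costs 3, 5 and 7).
example = [(1, []), (2, [(3, []), (5, []), (7, [])])]
```

With P uniform on 0..3 and Q uniform on 1..10, the six factors of P and five factors of Q give a probability of (1/4)^6 (1/10)^5 for `example`.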

If the tree generated by our process is infinite then the
limit of the partial probabilities will usually go to zero
since we are considering the product of numbers between 0 and
1. Although the probability of generating a particular infinite
tree is usually zero, it can be shown that the probability
that a randomly generated tree is infinite is nonzero for many
probability functions P. This fact is a basic result of the
theory of branching processes [Feller 1963, pg. 7] and may be
stated more precisely as follows: Let $\bar{P}$ denote the mean of
P and assume P(0) > 0. If $\bar{P} \le 1$ then a randomly generated
tree is finite with probability 1. If $\bar{P} > 1$ then a randomly
generated tree is finite with probability $\xi$, where $\xi$ is the
least positive fixed point of the generating function for P,
$\xi = p(\xi)$, where

$$p(s) = \sum_{k=0}^{\infty} P(k)\, s^k .$$

A randomly generated tree is then infinite with probability
$1 - \xi$. By an infinite tree we mean not only a tree with
unbounded depth but also that the number of nodes at depth d grows
unboundedly in d. It turns out that the probability that a random
tree has unbounded depth but a bounded nonzero number of nodes on
all levels is zero for all P.

Let sons(n) denote the number of sons of the node n. The
arc-labelled tree T is in the class of (P,Q)-trees if and only
if P(sons(n)) > 0 for all $n \in N$ and Q(C(a)) > 0 for all
$a \in A$. E.g., if a node in T has 11 sons but P(11) = 0 then T is
not a (P,Q)-tree.
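The branching-process result quoted above can be checked numerically. The sketch below is not taken from the report: it uses the standard fixed-point iteration s <- p(s) started at s = 0, which converges to the least nonnegative fixed point of the generating function, i.e. to the probability that the generated tree is finite. The function name and the tolerance parameters are assumptions.

```python
def finite_tree_probability(P, tol=1e-12, max_iter=100_000):
    """Least nonnegative fixed point of p(s) = sum_k P(k) * s**k.

    P is a dict {number of sons: probability}. The iteration starts at
    s = 0 and stops when successive values differ by less than tol.
    """
    def p(s):
        return sum(prob * s ** k for k, prob in P.items())

    s = 0.0
    for _ in range(max_iter):
        nxt = p(s)
        if abs(nxt - s) < tol:
            return nxt
        s = nxt
    return s
```

For P(0) = 1/4, P(2) = 3/4 (mean 1.5) the fixed points of p are 1/3 and 1, so a random tree is finite with probability 1/3. Convergence is slow when the mean of P is exactly 1.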


The remainder of this dissertation is concerned with the
expected performance of BB on the class of (P,Q)-trees for
arbitrary P and Q. All theorems about (P,Q)-trees are implicitly
quantified over all probability mass functions P and Q though
it may not be stated. It is important to ask how successful
this transfer is of the notion of random problem instance to a
random (P,Q)-tree. Can we predict (through careful choice of P
and Q) the expected performance of BB on a combinatorial
minimization problem by finding the expected performance of BB on
the class of (P,Q)-trees?

The key assumption in this model is the independence of
each application of the branching rule and the independence of
each assignment of arc costs. It might be expected however
that the degree of a node depends somewhat on the depth in the
tree. In particular for finite trees the probability that a
node has zero sons should go to one with increasing depth.
That these observations are so for the traveling salesman
problem is borne out by table 2 in chapter 7.

In defense of this model it may be noted that this is
perhaps the simplest possible model of branch and bound trees
and should prove more amenable to analysis than a more complex
model. With regards to the independence assumption, for
sufficiently large trees the branch and bound process examines only
the topmost part of the full tree which may have much more
uniform properties than the tree as a whole. We present evidence
in chapter 7 that the theory of (P,Q)-trees can be applied with
good predictive power to a branch and bound algorithm for
solving traveling salesman problems, and we conjecture that the
model is applicable to at least those branch and bound
algorithms which employ a relaxation procedure.

Some notation follows which will be needed. For a random
variable x and a probability mass function p(x), the expected
value and variance of x are computed by

$$E(x) = \sum_{x} x\, p(x)$$

$$\sigma_x^2 = E(x^2) - E(x)^2 .$$

The first and second moments of P will be denoted $\bar{P}$ and
$\bar{\bar{P}}$ respectively, i.e.,

$$\bar{P} = \sum_{k \ge 0} k\, P(k) \quad \text{and} \quad \bar{\bar{P}} = \sum_{k \ge 0} k^2 P(k) .$$


3.2   Properties  of  a  Class  of  (P,Q)-trees. 

Before studying the behavior of BB on (P,Q)-trees it will
be useful to develop expressions for some important properties
of a class of (P,Q)-trees. For example what is the expected
path length of a randomly picked path in a randomly picked
tree? The probability that a node is a leaf is P(0) and the
probability that a node has some sons is 1-P(0). A branch of
length k then has probability $(1-P(0))^k P(0)$, a geometric
distribution. The expected path length is

$$\sum_{k=0}^{\infty} k\,(1-P(0))^k P(0) = (1-P(0))/P(0). \qquad (1)$$
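The closed form in (1) is the usual mean of a geometric distribution. As a short worked step (writing q = 1 - P(0), a symbol introduced here only for brevity, and assuming P(0) > 0 so that the series converges):

```latex
\sum_{k=0}^{\infty} k\, q^{k} (1-q)
  = (1-q)\, q \sum_{k=1}^{\infty} k\, q^{k-1}
  = (1-q)\, q \cdot \frac{1}{(1-q)^{2}}
  = \frac{q}{1-q}
  = \frac{1-P(0)}{P(0)} .
```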


A more difficult question concerns the distribution of least
cost leaves over the class of (P,Q)-trees. Let opt(T) denote
the cost of the least cost leaf in an arc-labelled tree T. Let
O(i) denote the probability that opt(T)=i in a random
(P,Q)-tree T. O is defined on the nonnegative integers since
the cost of any leaf in a (P,Q)-tree is a nonnegative integer
by definition. A recurrence relation for O can be formulated
by equating two expressions for the probability that opt(T)>i
in a random (P,Q)-tree T. First note that no arcs can have a
cost of zero so the only way that a tree can have a least cost
leaf of cost zero is if the root is terminal, thus O(0) = P(0).
One expression for the probability that opt(T)>i is

$$1 - \sum_{k=0}^{i} O(k). \qquad (2)$$


Next consider the treetop shown in Figure 3.3a. The subtrees
$T_1, T_2, \ldots, T_j$ are themselves random (P,Q)-trees. The
probability that $opt(T'_k) > i$, where $T'_k$ is the kth subtree
plus the arc from the root as in Figure 3.3b, is

$$1 - \sum_{s=1}^{i} \sum_{c=1}^{s} Q(c)\, O(s-c). \qquad (3)$$

Figure 3.3a.   A Tree-top.

Figure 3.3b.   A Branch.

This expression sums over all combinations of arc costs c and
costs of least cost leaves within $T_k$ (letting s denote the cost
of the least cost leaf of the combined arc and subtree, s-c is the
cost of the least cost leaf of the subtree) for which the sum is
not greater than i. Since this expression applies independently to
each of any number of branches, the probability that the tree-top
of Figure 3.3a has j branches and opt(T)>i is

$$P(j) \Big[ 1 - \sum_{s=1}^{i} \sum_{c=1}^{s} Q(c)\, O(s-c) \Big]^{j} .$$

For i>0 the probability that opt(T)>i in a random (P,Q)-tree is

$$\sum_{j=1}^{\infty} P(j) \Big[ 1 - \sum_{s=1}^{i} \sum_{c=1}^{s} Q(c)\, O(s-c) \Big]^{j} . \qquad (4)$$

The case j=0 is not included in this expression because then
opt(T) = 0. Finally expressions (2) and (4) can be equated:

$$1 - \sum_{k=0}^{i} O(k) = \sum_{j=1}^{\infty} P(j) \Big[ 1 - \sum_{s=1}^{i} \sum_{c=1}^{s} Q(c)\, O(s-c) \Big]^{j} . \qquad (5)$$


This is a recurrence relation since O(i) appears on the left
but only the values O(0), O(1), ..., O(i-1) appear on the right
for $i \ge 1$. In the appendix this recurrence relation is broken
down into simpler recurrence relations in order to speed up the
computation of O. Except for special P and Q this recurrence
relation seems to have no general analytic solution. Empirical
data on uniformly distributed P and Q suggests that O(n) is
asymptotic to $d^n$ as $n \to \infty$ for some constant d that
depends on P and Q. Figure 3.4 shows some values of O for the class
of $(P_{10}, Q_{100})$-trees where $P_{10}(k) = 1/11$ if and only if
$0 \le k \le 10$, and $Q_{100}(c) = 1/100$ if and only if
$1 \le c \le 100$.

Figure 3.4.   O(i) for $(P_{10}, Q_{100})$-trees.
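Recurrence (5) can be evaluated directly, without the speed-ups that the report defers to its appendix. The sketch below is only an illustration: the dictionary representation of P and Q, the name `opt_distribution` and the use of floating point are assumptions. It keeps a running value of the double sum in (5) and solves for O(i).

```python
def opt_distribution(P, Q, imax):
    """Return [O(0), ..., O(imax)] computed from recurrence (5).

    P maps a number of sons to its probability and Q maps an arc cost
    (an integer >= 1) to its probability.
    """
    O = [P.get(0, 0.0)]                    # O(0) = P(0)
    branch_le = 0.0                        # sum_{s=1..i} sum_{c=1..s} Q(c) O(s-c)
    for i in range(1, imax + 1):
        # add the s = i term: a branch whose least cost leaf costs exactly i
        branch_le += sum(Q.get(c, 0.0) * O[i - c] for c in range(1, i + 1))
        # right-hand side of (5): probability that opt(T) > i
        tail = sum(pj * (1.0 - branch_le) ** j for j, pj in P.items() if j >= 1)
        O.append(1.0 - sum(O) - tail)
    return O
```

As a sanity check, with P(0) = P(1) = 1/2 and Q(1) = 1 every tree is a single path, the least cost leaf has cost equal to its depth, and the function returns O(i) = (1/2)^(i+1).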
Let dep(T) denote the least depth at which a least cost
leaf may be found in an arc-labelled tree T. A generalization
of the function O(i) is the function d(i,k) = probability that
opt(T)=i and dep(T)=k in a random (P,Q)-tree T. In a manner
similar to the derivation of (5) above, a recurrence relation
can be formulated for d(i,k) by equating two expressions for
the probability that opt(T)>i or (opt(T)=i and dep(T)>k):

$$1 - \sum_{j=0}^{i-1} \sum_{m=0}^{j} d(j,m) - \sum_{m=0}^{k} d(i,m)
  = \sum_{j=1}^{\infty} P(j) \Big[ 1 - \sum_{c=1}^{i} Q(c)\, d(0,0)
    - \sum_{c=1}^{i-2} Q(c) \sum_{m=1}^{i-c-1} \sum_{n=1}^{m} d(m,n)
    - \sum_{c=1}^{i-1} Q(c) \sum_{m=1}^{k-1} d(i-c,m) \Big]^{j} \qquad (6)$$

where   d(0,0) = P(0),
        d(i,m) = 0 for m > i,
        d(i,0) = 0 for i > 0.

Note that the left-hand side of (6) has the term d(i,m)
whereas the right-hand side uses only the terms d(j,k) for j<i
or (j=i and k<m).

By taking marginal sums of d(i,m) we obtain two important
functions concerning the class of (P,Q)-trees. First we can
derive the recurrence relation (5) for O(i) again since

$$O(i) = \sum_{m=0}^{\infty} d(i,m) = \sum_{m=1}^{i} d(i,m) \quad \text{for } i>0. \qquad (7)$$

Secondly, let DEP(m) be the probability that dep(T)=m in a
randomly generated (P,Q)-tree T. This function is given by

$$DEP(m) = \sum_{i=m}^{\infty} d(i,m). \qquad (8)$$

Figure 3.5 shows an example distribution of DEP(m). Note that
DEP(m) quickly approaches zero as m increases.

Figure 3.5.   DEP(m) for the class of $(P_{10}, Q_{100})$-trees.
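Relations (6) to (8) can be evaluated in the same direct way as the recurrence for O. The following sketch is an illustration under assumptions: P and Q are dictionaries, the function names are hypothetical, the infinite sum in (8) is truncated at the largest cost computed, and no attempt is made at the efficiency refinements mentioned for the appendix.

```python
def opt_depth_distribution(P, Q, imax):
    """Return d with d[i][k] = Pr[opt(T) = i and dep(T) = k], 0 <= k <= i <= imax."""
    p0 = P.get(0, 0.0)

    def q(c):
        return Q.get(c, 0.0)

    d = [[p0]]                              # d(0,0) = P(0)
    O = [p0]                                # O(i) = sum over m of d(i,m), relation (7)
    for i in range(1, imax + 1):
        row = [0.0] * (i + 1)               # d(i,0) = 0 for i > 0
        below = sum(O)                      # Pr[opt(T) < i]
        # branches whose subtree is a leaf and whose arc costs at most i
        leaf = p0 * sum(q(c) for c in range(1, i + 1))
        # branches with a non-leaf subtree and total least cost below i
        cheaper = sum(q(c) * sum(O[1:i - c]) for c in range(1, i - 1))
        for k in range(1, i + 1):
            # branches of least cost exactly i whose best leaf is at depth <= k
            shallow = sum(q(c) * sum(d[i - c][1:k]) for c in range(1, i))
            rhs = sum(pj * (1.0 - leaf - cheaper - shallow) ** j
                      for j, pj in P.items() if j >= 1)
            row[k] = 1.0 - below - sum(row[:k]) - rhs
        d.append(row)
        O.append(sum(row))
    return d


def depth_distribution(d):
    """DEP(m) from (8), truncated at the largest cost present in d."""
    return [sum(row[m] for row in d[m:]) for m in range(len(d))]
```

For the single-path case P(0) = P(1) = 1/2, Q(1) = 1 the only nonzero entries are d(i,i) = (1/2)^(i+1), which gives a quick check of the index ranges.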
Chapter 4.
Heuristic Search Strategies.

Subsequent chapters will investigate some particular well
known search strategies but it seems appropriate to first look
at search strategies in general. In chapter 2 the function
ET(h,p,b) was defined as the time complexity for solving the
problem instance p when BB uses the search strategy realized by
the heuristic function h and is given an initial bound of b.
Define the expected time complexity of a heuristic search of a
random (P,Q)-tree as

$$ET_h(b) = \sum_{t} \Pr(t)\, ET(h,t,b) \qquad (1)$$

where t varies over all (P,Q)-trees and Pr(t) is the probability
of t as defined in the previous chapter. Of particular interest
for comparison purposes in this thesis will be the limit
as b goes to $\infty$ of $ET_h(b)$, denoted by $E(N_T)$ or
$ET_h(\infty)$.

The time complexity of a branch and bound search may be
viewed as a random sum of independent variables. The i-th such
variable is the number of sons inserted on the queue by the i-th
explored node. Let $G_h(k)$ denote the probability that exactly k
nodes are explored during the search of a random (P,Q)-tree
under search strategy h. If we let $X_i$ denote the number of
sons that the i-th explored node has then the size of the search
tree, denoted by the random variable $N_T$, is a random sum
$1 + X_1 + X_2 + \cdots + X_k$ where each $X_i$ is distributed
according to P and k is distributed according to $G_h$. See
section 3.1 for the definitions of $\bar{P}$ and $\bar{\bar{P}}$.


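Theorem 4.1.   Let $N_T = 1 + X_1 + X_2 + \cdots + X_k$ be a random
sum where each variable $X_i$ is distributed according to P and k
is a random variable distributed according to $G_h$, then

$$E(N_T) = 1 + \bar{P}\,\bar{G}_h \qquad (2)$$

$$\sigma^2_{N_T} = \bar{\bar{P}}\,\bar{G}_h + \bar{P}^2 \big( \bar{\bar{G}}_h - \bar{G}_h - \bar{G}_h^2 \big). \qquad (3)$$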


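Proof:  Let p(z) and g(z) be the generating functions for
P and $G_h$ respectively, i.e.

$$g(z) = \sum_{n=0}^{\infty} G_h(n)\, z^n , \qquad p(z) = \sum_{n=0}^{\infty} P(n)\, z^n .$$

It is a well-known result (cf. Feller 1959, pg. 286-287) that
the generating function for the random sum
$X_1 + X_2 + \cdots + X_k$ is g(p(z)). It can also be shown
(Feller 1959, pg. 265-266) that for a variable x distributed
according to F(x) with generating function f(z),

$$f(1) = 1 \qquad (4)$$

$$E(x) = f'(1) \quad \text{(first derivative of } f\text{)}. \qquad (5)$$

Let $f''$ denote the second derivative of f, then

$$f''(1) = \bar{\bar{F}} - \bar{F} \qquad (6)$$

$$\sigma_x^2 = f''(1) + f'(1) - f'(1)^2 . \qquad (7)$$

From these relations it is straightforward to derive the
relations of the theorem. In what follows we use the symbol $D_z$
in the usual way as the derivative with respect to z.

$$\begin{aligned}
E(N_T - 1) &= D_z\, g(p(z)) \big|_{z=1} \\
           &= g'(p(z))\, p'(z) \big|_{z=1} \\
           &= g'(p(1))\, p'(1) \\
           &= g'(1)\, p'(1) \\
           &= \bar{G}_h \bar{P} \qquad \text{by (5).}
\end{aligned}$$

Thus we have shown

$$E(N_T) = 1 + \bar{G}_h \bar{P} .$$

The variance can also be derived straightforwardly.

$$\begin{aligned}
D_z^2\, g(p(z)) \big|_{z=1} &= D_z\, g'(p(z))\, p'(z) \big|_{z=1} \\
  &= g''(p(z))\, p'(z)\, p'(z) + p''(z)\, g'(p(z)) \big|_{z=1} \\
  &= g''(1)\, p'(1)^2 + p''(1)\, g'(1) \\
  &= (\bar{\bar{G}}_h - \bar{G}_h)\, \bar{P}^2 + (\bar{\bar{P}} - \bar{P})\, \bar{G}_h \qquad \text{by (6).}
\end{aligned}$$

Thus

$$\begin{aligned}
\sigma^2_{N_T - 1} &= D_z^2\, g(p(z)) \big|_{z=1} + D_z\, g(p(z)) \big|_{z=1} - \big( D_z\, g(p(z)) \big|_{z=1} \big)^2 \qquad \text{by (7)} \\
  &= (\bar{\bar{G}}_h \bar{P}^2 - \bar{G}_h \bar{P}^2 + \bar{\bar{P}} \bar{G}_h - \bar{P} \bar{G}_h) + \bar{P} \bar{G}_h - \bar{P}^2 \bar{G}_h^2 \\
  &= \bar{P}^2 ( \bar{\bar{G}}_h - \bar{G}_h - \bar{G}_h^2 ) + \bar{\bar{P}} \bar{G}_h .
\end{aligned}$$

Now since $\sigma^2_{N_T} = \sigma^2_{N_T - 1}$, we have

$$\sigma^2_{N_T} = \bar{P}^2 ( \bar{\bar{G}}_h - \bar{G}_h - \bar{G}_h^2 ) + \bar{\bar{P}} \bar{G}_h . \qquad \text{QED}$$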
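Relations (2) and (3) of Theorem 4.1 can be checked exactly on small distributions. The sketch below is an illustration, not part of the report: it builds the distribution of the random sum by repeated convolution, assuming (as the generating-function argument does) that k is independent of the X_i and that G has finite support. All function names are hypothetical; exact rational arithmetic is used so the comparison is free of rounding.

```python
from fractions import Fraction


def moments(pmf):
    """First and second moments of a pmf given as {value: probability}."""
    m1 = sum(v * p for v, p in pmf.items())
    m2 = sum(v * v * p for v, p in pmf.items())
    return m1, m2


def random_sum_pmf(P, G):
    """Exact pmf of X_1 + ... + X_k, X_i ~ P i.i.d., k ~ G independent of the X_i."""
    total = {}
    conv = {0: Fraction(1)}                  # pmf of the empty sum
    for k in range(max(G) + 1):
        for s, p in conv.items():            # weight the k-fold convolution by G(k)
            total[s] = total.get(s, 0) + G.get(k, 0) * p
        nxt = {}
        for s, p in conv.items():            # convolve once more with P
            for x, px in P.items():
                nxt[s + x] = nxt.get(s + x, 0) + p * px
        conv = nxt
    return total


def theorem_4_1(P, G):
    """Mean and variance of N_T = 1 + X_1 + ... + X_k from (2) and (3)."""
    p1, p2 = moments(P)
    g1, g2 = moments(G)
    return 1 + p1 * g1, p2 * g1 + p1 * p1 * (g2 - g1 - g1 * g1)
```

For P = {0: 1/4, 1: 1/4, 2: 1/2} and G = {0: 1/2, 1: 1/4, 3: 1/4} both routes give mean 9/4 and variance 97/32.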
We immediately obtain the following corollary.

Corollary 4.1:  $E(N_T)$ exists if and only if $\bar{P}$ and
$\bar{G}_h$ exist. $\sigma^2_{N_T}$ exists if and only if $\bar{P}$,
$\bar{G}_h$, $\bar{\bar{P}}$, and $\bar{\bar{G}}_h$ exist.


Since the function P is assumed given, the main task in
the analysis of a heuristic search strategy is to find a
formulation for $G_h(k)$. Lower bounds can be found on the means of
$N_T$ and $N_S$ for any h.


Theorem 4.2.   For all exact heuristic functions h,

$$E_h(N_T) \ge 1 + \bar{P}/P(0),$$

$$E_h(N_S) \ge 1 + (\bar{P}-1)/P(0).$$


Proof:   By lemma 2.1 any exact search strategy must explore
the least cost leaf $s^*$ and all the nonleaf nodes n such
that $LB(n) < c(s^*)$. Let $\hat{h}$ be a heuristic function for a
search strategy which explores those nodes and no others. If we
imagine the nodes of a tree laid out in a sequence in order of
increasing cost then $\hat{h}$ explores just those nodes up to the
first leaf in the sequence. The probability that $\hat{h}$ explores
k nonleaf nodes is given by

$$G_{\hat{h}}(k) = (1-P(0))^k P(0) \qquad (8)$$

i.e., the first k nodes in the sorted sequence are nonterminal
(each with probability 1-P(0)) and exactly one leaf $s^*$ is
explored (with probability P(0)). We have

$$\bar{G}_{\hat{h}} = \sum_{k=0}^{\infty} k\,(1-P(0))^k P(0) = (1-P(0))/P(0). \qquad (9)$$

Also each nonleaf has j sons with probability
$P'(j) = P(j)/(1-P(0))$ for $j \ge 1$, thus

$$\bar{P}' = \sum_{j=1}^{\infty} j\,P(j)/(1-P(0)) = \bar{P}/(1-P(0)).$$

By theorem 4.1,

$$E_{\hat{h}}(N_T) = 1 + \frac{1-P(0)}{P(0)} \cdot \frac{\bar{P}}{1-P(0)} = 1 + \bar{P}/P(0). \qquad (10)$$

It follows that for any exact heuristic function h,

$$E_h(N_T) \ge 1 + \bar{P}/P(0).$$

The space complexity of a random (P,Q)-tree under any
search strategy is bounded below by the number of nodes on the
queue when the first leaf is found by the search. Again $N_S$ is
a random sum, but a random sum of random variables which are
slightly different from the random variables $X_i$ in $N_T$. Let
$G'_h(k)$ be the probability that k nonleaves are explored before
the first leaf is found. $G'_h(k)$ is the same as $G_{\hat{h}}(k)$
formulated above, so

$$\bar{G}'_h = (1-P(0))/P(0).$$

During the exploration of the i-th node, the node itself is
removed from the queue and $X_i$ nodes are added, thus the net
increase to the queue size is $X_i - 1$, denoted $X'_i$. The random
variables $X'_i$ are distributed according to P' where
$P'(x) = P(x+1)/(1-P(0))$. We have

$$\begin{aligned}
\bar{P}' &= \sum_{j=0}^{\infty} j\,P'(j) = \sum_{j=0}^{\infty} j\,P(j+1)/(1-P(0)) \\
         &= \frac{1}{1-P(0)} \Big[ \sum_{j=0}^{\infty} (j+1)\,P(j+1) - \sum_{j=0}^{\infty} P(j+1) \Big] \\
         &= \frac{\bar{P} - (1-P(0))}{1-P(0)} .
\end{aligned}$$

Therefore

$$\begin{aligned}
E(N_S) &\ge \bar{G}'_h \bar{P}' \qquad \text{(mean of the random sum } X'_1 + X'_2 + \cdots + X'_k \text{)} \\
       &= \frac{\bar{P} - (1-P(0))}{1-P(0)} \cdot \frac{1-P(0)}{P(0)} \\
       &= 1 + (\bar{P}-1)/P(0). \qquad \text{QED}
\end{aligned}$$


4.2   Techniques  for  analyzing  a  heuristic  search  strategy. 

The state of a branch and bound search process at the
beginning of statement 3 of BB (see figure 2.2) can be described
by the state of the priority queue and the value of the bound.
An actual branch and bound process may be described by a
sequence of such states. If we can give the probability that a
random process in state $S_i$ will be in state $S_j$ at the next
execution of statement 3 of BB given only that the current state
is $S_i$, then the set of all such states (including an initial
state) and the probabilities of the transitions between states
defines a Markov chain. Formally let
$\{S_0, S_1, S_2, \ldots\} \cup \{F_0, F_1, \ldots\}$ denote the
possible states of a search where $S_0 = \langle b_0; N_0 \rangle$
is the initial state, and
$S_i = \langle b; n_1, n_2, \ldots, n_k \rangle$ where b is a value
of the bound and the nodes $n_1, n_2, \ldots, n_k$ are on the queue
in that order, i.e., $h(n_1) \le h(n_2) \le \cdots \le h(n_k)$. The
final states are denoted by $F_i = \langle b; \rangle$ where b is
the value of the bound and the priority queue is empty (an empty
priority queue terminates BB). Let $r_{ij}$ denote the probability
that the process in state i will make a transition from state i to
state j. Let R denote the (infinite) transition probability matrix
$[r_{ij}]$. Suppose we wish to describe the behavior of BB under
search strategy h and given an initial bound of $b_0$. The initial
state is $\langle b_0; N_0 \rangle$. The transition probabilities
may be described as follows.


The transitions

$$\langle b;\, n_1, n_2, \ldots, n_k \rangle \to \langle b;\, n_2, \ldots, n_k \rangle \qquad (11)$$

occur with probability 1 for all states in which $b \le c(n_1)$.
These transitions reflect the act of pruning the subtree below
$n_1$ as effectively happens when statement 5 in figure 2.2 is
false. The transitions

$$\langle b;\, n_1, n_2, \ldots, n_k \rangle \to \langle c(n_1);\, n_2, \ldots, n_k \rangle \qquad (12)$$

occur with probability P(0) for all states in which $b > c(n_1)$.
These transitions reflect the action taken by BB on states for
which statement 5 is true and statement 7 is true (a leaf is
found which improves the value of the bound). The transitions

$$\langle b;\, n_1, n_2, \ldots, n_k \rangle \to \langle b;\, m_1, m_2, \ldots, m_{k-1+j} \rangle \qquad (13)$$

occur with probability $P(j)Q(c_1)Q(c_2) \cdots Q(c_j)$ where

$$\{m_1, m_2, \ldots, m_{k-1+j}\} = \{n_2, n_3, \ldots, n_k,\; n_1+c_1,\; n_1+c_2,\; \ldots,\; n_1+c_j\}$$

and $h(m_1) \le h(m_2) \le \cdots \le h(m_{k-1+j})$.

These transitions reflect the action taken by BB on states for
which statement 5 in figure 2.2 is true but statement 7 is
false (the sons of a node are added to the queue for later
exploration).

Since a node is removed from the queue at each transition
the number of transitions from the start state to a final
state, called the first passage time for a random process, is
the number of nodes inserted on the queue during the search;
i.e., the time complexity. There are well-known methods for
finding first passage times and their means [Parzen 1973], but
they are not of much help when the transition matrix is
infinite. A sequence of states $\{S_0, S_1, \ldots, S_n\}$ is
realizable if $S_0$ is the initial state, $S_n$ is a final state,
and for $0 \le i < n$, $r_{S_i S_{i+1}} > 0$. The set of all
realizable sequences defines the sample space on which our random
variables $N_T$ for the time complexity, and $N_S$ for the space
complexity are defined. The probability of a realizable sequence
is defined to be

$$\prod_{i=0}^{n-1} r_{S_i S_{i+1}} .$$
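The chain described above can be simulated one transition at a time. The sketch below is a hedged illustration only: it follows the three transition types as stated in the text rather than the BB procedure of figure 2.2 (which is not reproduced here), represents each node by its path cost, and by default orders the queue by cost. The names `step`, `first_passage_time` and `sample` are hypothetical.

```python
import random


def sample(pmf, rng):
    """Draw one value from a pmf given as a dict {value: probability}."""
    values = list(pmf)
    return rng.choices(values, weights=[pmf[v] for v in values], k=1)[0]


def step(state, P, Q, rng, h=lambda cost: cost):
    """Apply one transition to state = (bound, queue of node costs)."""
    bound, queue = state
    first, rest = queue[0], list(queue[1:])
    if bound <= first:                        # transition (11): prune n_1
        return bound, rest
    j = sample(P, rng)
    if j == 0:                                # transition (12): leaf improves bound
        return first, rest
    sons = [first + sample(Q, rng) for _ in range(j)]
    return bound, sorted(rest + sons, key=h)  # transition (13): insert the sons


def first_passage_time(P, Q, rng, bound=float("inf")):
    """Number of transitions from the initial state to an empty queue."""
    state, steps = (bound, [0]), 0            # initial state: only the root queued
    while state[1]:
        state, steps = step(state, P, Q, rng), steps + 1
    return steps
```

Because every transition removes exactly one node from the queue, the value returned by `first_passage_time` equals the number of nodes that were ever placed on the queue in that run.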

Insight is needed into the nature of a particular search
strategy in order to coarsen this sample space into appropriate
events such that an expression for the expected complexities
can be derived. Theorem 4.1 offers some help in this direction
by defining the event $E_n$ as the set of all realizable sequences
in which exactly n transitions of the types (12) and (13)
appear. We have defined the probability of event $E_n$ as
$G_h(n)$. Since P is given and it is assumed that $\bar{P}$ can be
found, the problem of finding $E(N_T)$ reduces to finding a
reasonable expression for $\bar{G}_h$, perhaps by first finding
$G_h(n)$.

Complementing this general approach to finding $E(N_T)$ and
$E(N_S)$ for a heuristic search strategy, there is a more ad-hoc
method. In this approach we observe how the data structure
which is normally used to implement the search strategy is
affected by searching a random (P,Q)-tree. Special properties of
these data structures may help in the analysis. For example, a
depth-first search is usually implemented using a stack. The
close relationship between stacks and the implementation of
recursion suggests that a recurrence relation may be the best way
to describe an algorithm employing a stack.
Chapter 5.
The Best-Bound-First Search Strategy

The best-bound-first search strategy as realized by 2-(1)
chooses to explore the least cost unexplored node on the
priority queue. The following theorem gives a characteristic
property of this search strategy.

Theorem  5.1.  The  first  leaf  explored  in  an  arc-labelled  tree 
by  BB  under  the  best-bound-first  search  strategy  is  optimal. 

Proof:   Best-bound-first explores the nodes of a tree in
order of increasing cost. The first leaf $s^*$ which is explored
then is optimal since all nodes (including leaves) which could
be explored subsequently have at least as great a cost as $s^*$.
QED


As a result of theorem 5.1 it is not necessary to explore
all stored nodes in a best-bound-first search. By the nature
of the heuristic function all nodes on the queue when the first
leaf has been found have a cost greater than or equal to the
cost of the leaf. Therefore there is no need to explore any
further nodes and the search may terminate. More generally,
whenever a search strategy has a best-bound-first component
(e.g. in ordered-depth-first, ordered-breadth-first), as soon as
a leaf is found or a node with greater cost than the bound,
then no more nodes need be examined from the priority queue.

Theorem 5.2.   The expected time and space complexities of a
best-bound-first search of a random (P,Q)-tree are

$$E_{bbf}(N_T) = 1 + \bar{P}/P(0) \qquad (1)$$

$$E_{bbf}(N_S) = 1 + (\bar{P}-1)/P(0) \qquad (2)$$

Proof:  Since the best-bound-first search strategy examines
nodes in order of increasing cost, it is a special case of
the heuristic function $\hat{h}$ analyzed in theorem 4.2. Therefore
the results derived for $\hat{h}$ also hold for best-bound-first
search.    QED

As a consequence of theorem 5.2 the best-bound-first
search strategy is optimal both in terms of time and space
within our model. Using theorems 4.1 and 4.2 it is possible to
derive expressions for the variances of $N_T$ and $N_S$ also.
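The expected search tree size in (1) can be illustrated by simulation. The sketch below is an assumption-laden illustration rather than the report's BB procedure: the tree is generated lazily (a node draws its number of sons only when it is explored), the search stops at the first explored leaf as theorem 5.1 permits, and P(0) > 0 is assumed so that the search terminates with probability 1. The names `sample` and `best_bound_first` are hypothetical.

```python
import heapq
import random


def sample(pmf, rng):
    """Draw one value from a pmf given as a dict {value: probability}."""
    values = list(pmf)
    return rng.choices(values, weights=[pmf[v] for v in values], k=1)[0]


def best_bound_first(P, Q, rng):
    """Search one random (P,Q)-tree; return the number of nodes generated."""
    queue = [(0, 0)]                     # (path cost, tie-breaker); the root costs 0
    generated = 1
    while True:
        cost, _ = heapq.heappop(queue)   # least cost unexplored node
        sons = sample(P, rng)            # sprout the node only when it is explored
        if sons == 0:                    # first explored leaf: stop the search
            return generated
        for _ in range(sons):
            heapq.heappush(queue, (cost + sample(Q, rng), generated))
            generated += 1
```

With P(0) = P(2) = 1/2 the formula gives 1 + 1/(1/2) = 3; averaging `best_bound_first` over many trials should come out close to that value, whatever Q is.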

Theorem 5.3.  The variances of the performance of a
best-bound-first search on a random (P,Q)-tree are

$$\sigma^2_T = \bar{P}^2/P(0)^2 + \bar{\bar{P}}/P(0) \qquad (3)$$

$$\sigma^2_S = (\bar{P}-1)^2/P(0)^2 + (\bar{\bar{P}}-1)/P(0) \qquad (4)$$

Proof:  The random variable $N_T$ is a random sum
$1 + X_1 + X_2 + \cdots$ where $X_i \ge 1$ has the probability

$$P'(X_i) = P(X_i)/(1-P(0)),$$

and the probability that there are k terms (the number of
nonterminal nodes) in the sum is

$$G(k) = P(0)\,(1-P(0))^k$$

as in theorem 4.2. The variance of $N_T$ is

$$\sigma^2_{N_T} = \bar{P}'^2 ( \bar{\bar{G}} - \bar{G} - \bar{G}^2 ) + \bar{\bar{P}}' \bar{G} \qquad (5)$$

by theorem 4.1. We have

$$\bar{P}' = \sum_{X=1}^{\infty} X\, \frac{P(X)}{1-P(0)} = \frac{\bar{P}}{1-P(0)},$$

$$\bar{\bar{P}}' = \sum_{X=1}^{\infty} X^2\, \frac{P(X)}{1-P(0)} = \frac{\bar{\bar{P}}}{1-P(0)},$$

$$\bar{G} = \sum_{k=0}^{\infty} k\,P(0)\,(1-P(0))^k = \frac{1-P(0)}{P(0)} .$$

We can find an expression for $\bar{\bar{G}} - \bar{G}$, which is
needed in order to evaluate (5), as follows.

$$\bar{\bar{G}} - \bar{G} = \sum_{k=0}^{\infty} (k^2-k)\,P(0)\,(1-P(0))^k = \sum_{k=0}^{\infty} k(k-1)\,P(0)\,(1-P(0))^k .$$

Let

$$GG(z) = \sum_{k=0}^{\infty} P(0)\,(1-P(0))^k z^k = \frac{P(0)}{1 - z(1-P(0))} .$$

By taking the second derivative of GG with respect to z, we get

$$\begin{aligned}
GG''(z) &= \sum_{k=0}^{\infty} k(k-1)\,P(0)\,(1-P(0))^k z^{k-2} \\
        &= D_z^2\, P(0)\,[1 - z(1-P(0))]^{-1} \\
        &= D_z\, P(0)\,(1-P(0))\,[1 - z(1-P(0))]^{-2} \\
        &= \frac{2P(0)\,(1-P(0))^2}{[1 - z(1-P(0))]^3} .
\end{aligned}$$

Then

$$\bar{\bar{G}} - \bar{G} = GG''(1) = \frac{2P(0)\,(1-P(0))^2}{[1-(1-P(0))]^3} = \frac{2(1-P(0))^2}{P(0)^2} .$$

Substituting all these terms into (5), we obtain

$$\begin{aligned}
\sigma^2_{N_T} &= \frac{\bar{P}^2}{(1-P(0))^2} \Big( \frac{2(1-P(0))^2}{P(0)^2} - \frac{(1-P(0))^2}{P(0)^2} \Big) + \frac{\bar{\bar{P}}}{1-P(0)} \cdot \frac{1-P(0)}{P(0)} \\
               &= \frac{\bar{P}^2}{P(0)^2} + \frac{\bar{\bar{P}}}{P(0)} .
\end{aligned}$$

The expression (4) for $\sigma^2_S$ can be derived in a similar
manner.   QED


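For example consider the class of $(P_r, Q_s)$-trees where
$P_r(k) = 1/(r+1)$ for $0 \le k \le r$, and $Q_s(c) = 1/s$ for
$1 \le c \le s$. Here $\bar{P} = r/2$, $\bar{\bar{P}} = r(2r+1)/6$
and $P(0) = 1/(r+1)$, so by (1) and (3)

$$E(N_T) = 1 + \frac{r/2}{1/(r+1)} = O(r^2),$$

$$\sigma^2_T = \Big( \frac{r/2}{1/(r+1)} \Big)^2 + \frac{r(2r+1)/6}{1/(r+1)} = O(r^4).$$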


If $(P_r, Q_s)$ were a good model of the trees generated by a
particular branch and bound algorithm under a best-bound-first
search strategy then the algorithm would have an expected
search tree size of $O(r^2)$ and a variance of $O(r^4)$. Note that
the space complexity of best-bound search is the same order of
magnitude as its time complexity.
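The closed forms of theorems 5.2 and 5.3 are easy to evaluate for any P with P(0) > 0. The sketch below is illustrative only; the function names and the dictionary keys `E_T`, `var_T`, `E_S`, `var_S` are labels chosen here, and exact rational arithmetic is used so the results can be compared without rounding.

```python
from fractions import Fraction


def bbf_moments(P):
    """Means and variances of N_T and N_S for best-bound-first search.

    P is a dict {number of sons: probability} with P[0] > 0.
    """
    p0 = P[0]
    m1 = sum(k * p for k, p in P.items())        # first moment of P
    m2 = sum(k * k * p for k, p in P.items())    # second moment of P
    return {
        "E_T": 1 + m1 / p0,
        "var_T": m1 ** 2 / p0 ** 2 + m2 / p0,
        "E_S": 1 + (m1 - 1) / p0,
        "var_S": (m1 - 1) ** 2 / p0 ** 2 + (m2 - 1) / p0,
    }


def uniform_P(r):
    """The uniform distribution on 0..r used in the example."""
    return {k: Fraction(1, r + 1) for k in range(r + 1)}
```

For r = 4 this gives an expected search tree size of 11 with variance 130, and an expected queue size of 6 with variance 50; the mean grows as r^2 and the variance as r^4.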

Some comments are in order here. First, the time and
space complexity of a best-bound-first search do not depend on
Q, the distribution of arc costs. Only the relative ordering
of nodes in a tree is important to a best-bound-first search,
not the particular costs. This fact together with the
assumption of mutual independence of the nodes of a (P,Q)-tree
account for the absence of Q in (1) and (2).

Second, although the assumption of our model doesn't
generally hold for interesting combinatorial minimization
problems, (1) can be used to obtain an upper bound on the expected
time complexity of a problem. Simply stated, the product of an
upper bound on $\bar{P}$ (average degree of a random node) and an
upper bound on 1/P(0) (P(0) is the probability that a random node
supplies a feasible solution) yields an upper bound on the
expected time complexity. In chapter 7, we derive bounds along
these lines for a subtour-elimination algorithm for the
traveling salesman problem. Following the discussion above, upper
bounds on $\bar{P}$ and 1/P(0), namely O(ln(n)) (order of the
natural logarithm of n) and n/e respectively, are multiplied to
obtain an estimated upper bound (= n ln(n)/e) on the expected time
complexity.
Chapter 6.
Depth-First  Search  Strategies 

6.1  Expected  Time  Complexity 

The choice of which node to explore next in a depth-first
search is made between the sons of the most recently explored
node (if any), otherwise the sons of the next most recently
explored node, and so on. In an ordered-depth-first search (odf)
the sons are explored in the order of increasing cost as
realized by the heuristic function in equation 2-(2a). If the sons
are explored in the order of generation, then the search is
called a generation-order-depth-first search (godf) as realized
by the heuristic function in equation 2-(2b). Let $ET_{odf}(b)$
denote the expected size of the search tree generated by BB on
a random (P,Q)-tree using the ordered-depth-first search
strategy and given an initial bound of b. Let $ET_{godf}(b)$ denote
the corresponding expected value for a generation-order-depth-first
search. Expressions for these functions can be formulated
fairly naturally as recurrence relations. Suppose that BB is
searching a tree with the structure shown in Figure 6.1 where
each subtree $T_1, T_2, \ldots, T_j$ may be regarded as a random
(P,Q)-tree. Let $b_0$ be a finite initial bound (the bound on the
root) and let $b_i$ denote the bound at the top of the subtree
$T_i$ for $1 \le i \le j$. Then the expected size of the search
tree for a random tree with this structure is

$$1 + ET(b_1) + ET(b_2) + \cdots + ET(b_j).$$

Each of the bounds $b_i$ for $1 \le i \le j$ is less than $b_0$,
indicating that a recurrence relation may be set up. The next
problem concerns the probability of a given bound occurring at a
given node. Consider the tree in Figure 6.1. Given an initial bound
of $b_0$, the bound on the subtree $T_1$ is $b_0 - c_1$, so BB is
expected to search $ET(b_0 - c_1)$ nodes in $T_1$. $opt(T_1) = m$
with probability O(m) since it is a random (P,Q)-tree. The same
holds for the other subtrees. Suppose that $opt(T_1) = m_1$. If
$m_1 \ge b_1$ then the search will not find the least cost leaf. On
the other hand if $m_1 < b_1$ then the search will find it. To
summarize, the bound returned after searching the first subtree of
the tree of Figure 6.1 with initial bound $b_0$ is
$\min\{b_0,\, c_1+m_1\}$. The bound on the subtree $T_2$ is
$\min\{b_0,\, c_1+m_1\} - c_2$. Continuing this reasoning one finds
that the bound on the i-th subtree $T_i$ is

$$b_i = \min\{b_0,\; c_1+m_1,\; c_2+m_2,\; \ldots,\; c_{i-1}+m_{i-1}\} - c_i \qquad (1)$$

where $m_1, m_2, \ldots, m_{i-1}$ denote the costs of the least cost
leaves of the subtrees $T_1, T_2, \ldots, T_{i-1}$ respectively. ET
evaluated with expression (1) yields the expected size of the
subtree $T_i$.
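Relation (1) for the bounds handed down to successive subtrees can be traced with a few lines of code. This is a minimal sketch for illustration; the function name `subtree_bounds` and its argument names are assumptions.

```python
def subtree_bounds(b0, arc_costs, subtree_opts):
    """Bounds b_1, ..., b_j passed to subtrees T_1, ..., T_j by relation (1).

    arc_costs[i] is the cost c of the arc to a subtree and subtree_opts[i]
    is the cost m of the least cost leaf inside that subtree.
    """
    bounds = []
    best = b0                          # min{b_0, c_1 + m_1, ..., c_(i-1) + m_(i-1)}
    for c, m in zip(arc_costs, subtree_opts):
        bounds.append(best - c)        # bound relative to the top of this subtree
        best = min(best, c + m)        # a cheaper leaf tightens all later bounds
    return bounds
```

For example, with an initial bound of 10, arc costs (3, 2, 4) and subtree optima (5, 1, 2), the subtrees receive the bounds 7, 6 and -1; the last subtree is therefore pruned immediately.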

The following function gives the expected size of T_i over
all (P,Q)-trees. Let the functions Wd_odf(b,i) and Wd_godf(b,i)
denote the expected size of the search tree of T_i when the root
is given an initial bound of b, for an ordered-depth-first
search and a generation-order-depth-first search respectively.
An expression for Wd_godf(b,i) can be found by essentially
enumerating over all possible combinations of variables in (1).

Figure 6.1. A tree-top.


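    Wd_{godf}(b,i) = \sum_{c_1=1}^{\infty} \cdots \sum_{c_i=1}^{\infty}
                     \sum_{m_1=0}^{\infty} \cdots \sum_{m_{i-1}=0}^{\infty}
                       Q(c_1) \cdots Q(c_i)\, \tilde{O}(m_1) \cdots \tilde{O}(m_{i-1})
                       ET_{godf}(\min\{ b,\ c_1+m_1,\ \ldots,\ c_{i-1}+m_{i-1} \} - c_i)        (2)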


A tree with a structure as in Figure 6.1 will have expected
size

    1 + \sum_{i=1}^{j} Wd_{godf}(b,i).

This expression summed over all j (number of sons of the root)
gives an expression for ET_godf(b):

    ET_{godf}(b) = \sum_{j=0}^{\infty} P(j) \Bigl( 1 + \sum_{i=1}^{j} Wd_{godf}(b,i) \Bigr)
                 = 1 + \sum_{j=1}^{\infty} P(j) \sum_{i=1}^{j} Wd_{godf}(b,i).        (3)

As stated, (2) is computationally intractable; however, it can be
refined to a more computable form, as given in Appendix A.

The order of examining the subtrees of Figure 6.1 by an
ordered-depth-first search is treated as follows. In an
arbitrary (P,Q)-tree with this structure, the arc costs
c_1, c_2, ..., c_i are unordered. By rearranging the tree the arc
costs can be brought into sorted order. Note though that a
given ordered sequence c_1 ≤ c_2 ≤ ... ≤ c_i may result from the
sorting of many distinct sequences. The appropriate combinatorial
question is how many unique arrangements R_i(c_1, c_2, ..., c_i) of
this sequence there are. There are i! nonunique arrangements
but repetitions must be accounted for. If k of the i values
have the same value, c_j = c_{j+1} = ... = c_{j+k-1}, then there is
a repetition factor of k! due to this relation. In general

    R_i(c_1, c_2, \ldots, c_i) = \frac{i!}{r_1!\, r_2! \cdots r_k!}

where

    c_1 = \cdots = c_{r_1} < c_{r_1+1} = \cdots = c_{r_1+r_2} < \cdots
        < c_{r_1+r_2+\cdots+r_{k-1}+1} = \cdots = c_{r_1+r_2+\cdots+r_k}

and r_1 + r_2 + ... + r_k = i (i.e., there are r_1 variables with the
same value, r_2 variables with the same value, and so on).
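
As a small check of this counting argument, the following Python sketch compares the multinomial count i!/(r_1! ... r_k!) with a brute-force enumeration of distinct orderings. The function name and the sample cost list are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def distinct_arrangements(costs):
    """Number of distinct orderings of a multiset of arc costs."""
    count = factorial(len(costs))
    # Divide out the repetition factor r! for every group of equal costs.
    for multiplicity in Counter(costs).values():
        count //= factorial(multiplicity)
    return count


costs = [1, 1, 2, 3, 3, 3]
brute_force = len(set(permutations(costs)))  # enumerate and deduplicate
print(distinct_arrangements(costs), brute_force)
```

The brute-force count is only practical for very short sequences; the closed form is what the enumeration in the text relies on.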

Again, by enumerating over all possible ordered sequences
c_1, ..., c_i and m_1, ..., m_{i-1} of the variables in (1), an
expression for Wd_odf(b,i) can be found.

    Wd_{odf}(b,i) = \sum_{c_1=1}^{\infty} \sum_{c_2=c_1}^{\infty} \cdots \sum_{c_i=c_{i-1}}^{\infty}
                      Q(c_1) Q(c_2) \cdots Q(c_i)\, R_i(c_1, c_2, \ldots, c_i)
                    \sum_{m_1=0}^{\infty} \cdots \sum_{m_{i-1}=0}^{\infty}
                      \tilde{O}(m_1) \cdots \tilde{O}(m_{i-1})
                      ET_{odf}(\min\{ b,\ c_1+m_1,\ c_2+m_2,\ \ldots,\ c_{i-1}+m_{i-1} \} - c_i)        (4)


A tree with a structure as in Figure 6.1 will have expected
size

    1 + \sum_{i=1}^{j} Wd_{odf}(b,i).

This expression summed over all j (number of sons of the root)
gives an expression for ET_odf(b):

    ET_{odf}(b) = \sum_{j=0}^{\infty} P(j) \Bigl( 1 + \sum_{i=1}^{j} Wd_{odf}(b,i) \Bigr)
                = 1 + \sum_{j=1}^{\infty} P(j) \sum_{i=1}^{j} Wd_{odf}(b,i).        (5)

Hereafter we will omit the subscript on ET and Wd except
when necessary, since all properties to be given have the same
form for both (i.e., the following theorems hold with either
subscript added). When the limit of the sequence ET(b)
exists, it will be denoted ET(∞). Clearly when this limit
exists we have the existence for each i of the limit of Wd(b,i),
denoted Wd(∞,i). In corollary 4.1 we found necessary conditions
for the existence of ET(∞), i.e. the existence of F and G_h.
Henceforth we will restrict our discussion to classes of
(P,Q)-trees for which ET(∞) exists. We suspect that those
classes of (P,Q)-trees for which the limit of ET does not exist
are not particularly interesting, in that they cannot be good
models of combinatorial minimization problems.

Several results about these limits can be shown. Theorem
6.1 asserts that, in the limit of an infinite initial bound, the
expected size of a search tree is the same as the expected size of
the search tree of the first subtree of the root.

Theorem 6.1.   ET(∞) = Wd(∞,1).

Proof:
    \lim_{b\to\infty} Wd(b,1) = \lim_{b\to\infty} \sum_{c=1}^{\infty} Q(c)\, ET(b-c)        by (2),
                              = \sum_{c=1}^{\infty} Q(c) \lim_{b\to\infty} ET(b-c)
                              = ET(\infty) \sum_{c=1}^{\infty} Q(c)
                              = ET(\infty).        QED

Theorem 6.2 gives an expression for ET(∞) which is similar to
that for a best-bound-first search given by eq. 5-1. The
quantity W is the expected number of nodes in the search tree
except those in the first subtree.

Theorem 6.2.   ET(∞) = W/P(0), where

    W = 1 + \sum_{j=2}^{\infty} P(j) \sum_{i=2}^{j} Wd(\infty,i).

Proof:
    ET(\infty) = 1 + \sum_{j=1}^{\infty} P(j) \sum_{i=1}^{j} Wd(\infty,i)
               = 1 + Wd(\infty,1) \sum_{j=1}^{\infty} P(j) + \sum_{j=2}^{\infty} P(j) \sum_{i=2}^{j} Wd(\infty,i)
               = 1 + ET(\infty)(1 - P(0)) + \sum_{j=2}^{\infty} P(j) \sum_{i=2}^{j} Wd(\infty,i)

by theorem 6.1; thus,

    ET(\infty) = \Bigl[ 1 + \sum_{j=2}^{\infty} P(j) \sum_{i=2}^{j} Wd(\infty,i) \Bigr] / P(0) = W/P(0).        QED


We can reason from theorem 6.2 to a lower bound on ET(∞)
which was established by different means in theorem 4.2. Here
\bar{P} = \sum_j j P(j) denotes the mean number of sons of a node.

Proposition 6.2.   ET(∞) ≥ 1 + \bar{P}/P(0).

Proof:
    W = 1 + \sum_{j=2}^{\infty} P(j) \sum_{i=2}^{j} Wd(\infty,i)
      \ge 1 + \sum_{j=2}^{\infty} P(j) \sum_{i=2}^{j} 1
      = 1 + \sum_{j=2}^{\infty} (j-1) P(j)
      = 1 + \sum_{j=2}^{\infty} j P(j) - \sum_{j=2}^{\infty} P(j)
      = 1 + \bar{P} - P(1) - (1 - P(0) - P(1))
      = \bar{P} + P(0).

Thus ET(∞) = W/P(0) ≥ (P(0) + \bar{P})/P(0) = 1 + \bar{P}/P(0).   QED
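
The algebra in this proof can be checked numerically. The Python sketch below uses a hypothetical branching distribution with finite support (the numbers are illustrative only) and confirms that 1 plus the sum of (j-1)P(j) over j ≥ 2 equals the mean number of sons plus P(0).

```python
def lower_bound_check(P):
    """P[j] is the probability that a node has j sons (finite support).

    Returns the lower bound on W used in the proof, the closed form
    (mean number of sons) + P(0), and the resulting lower bound on
    the limiting expected search tree size.
    """
    mean_sons = sum(j * p for j, p in enumerate(P))
    w_lower = 1 + sum((j - 1) * p for j, p in enumerate(P) if j >= 2)
    return w_lower, mean_sons + P[0], 1 + mean_sons / P[0]


w_lower, closed_form, et_lower = lower_bound_check([0.3, 0.2, 0.4, 0.1])
print(w_lower, closed_form, et_lower)
```

Dividing the first value by P(0) reproduces the third, which is the bound stated in the proposition.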

6.2  Time complexity as a function of the depth of the first
leaf found in a depth-first search tree.

The depth at which the first leaf is found in a depth-first
search has a strong effect on the performance of the
search. Intuitively, if this depth is large then the procedure
will spend much of its time examining nodes in that part of the
tree before returning to shallower levels where the true least
cost leaf may lie. It might be conjectured that the size of a
search tree tends to grow exponentially in the depth of the
first leaf which it finds. To the contrary, in what follows it
is shown that a depth-first search tree has a structure which
is essentially linear in the depth of the first found solution.
Let X(h) be the expected number of nodes in the search tree
except those in the first subtree, given that the first solution
is found at depth h. X(0) is defined to be 1. See Figure 6.2.
Let S(h) be the expected number of nodes searched in a random
(P,Q)-tree given that the first solution occurs at depth h.
From these definitions one finds

    S(h) = 1 + h + X(1) + X(2) + \cdots + X(h)
         = 1 + h + \sum_{k=1}^{h} X(k).        (6)

In order to formulate an expression for X(d), an appropriate
variant of Õ will be needed. Let Õ(d,i) denote the probability
that opt(T) = i in a random (P,Q)-tree T given that the
leftmost branch of T has length d. Similar reasoning to that
which led to the expression 3-(4) for Õ(i) yields an expression
for Õ(d,i). Again the method is to equate two expressions for
the probability that opt(T) > i. One expression is

    1 - \sum_{k=0}^{i} \tilde{O}(d,k).        (7)

Suppose now that the root has j ≥ 1 sons with subtrees
T_1, T_2, ..., T_j. This event occurs with probability
P(j)/(1-P(0)) for d > 0, because the condition that the tree has a
leftmost branch of length d disallows the possibility that the
root has zero sons: the probability that the root has j sons given
that j ≠ 0 is P(j)/(1-P(0)). By assumption subtree T_1 has a
leftmost branch of length d-1, so Õ(d-1,i) applies to it, and the
probability that opt(root+T_1) > i is

    1 - \sum_{s=1}^{i} \sum_{c=1}^{s} Q(c)\, \tilde{O}(d-1,\ s-c).

Figure 6.2. The Structure of a Depth-First Search Tree.

For the subtrees T_2, ..., T_j, the probability that
opt(T_k + root) > i is again given by 3-(3), since these may be
random (P,Q)-trees. Thus, summing over all trees T with leftmost
branch of length d, we have another expression for the
probability that opt(T) > i:

    \sum_{j=1}^{\infty} \frac{P(j)}{1-P(0)}
      \Bigl[ 1 - \sum_{s=1}^{i} \sum_{c=0}^{s-1} Q(s-c)\, \tilde{O}(d-1,c) \Bigr]
      \Bigl[ 1 - \sum_{s=1}^{i} \sum_{c=0}^{s-1} Q(s-c)\, \tilde{O}(c) \Bigr]^{j-1}        (8)

Expressions (7) and (8) may now be equated to establish a
recurrence relation for Õ(d,i):

    1 - \sum_{k=0}^{i} \tilde{O}(d,k) =
      \sum_{j=1}^{\infty} \frac{P(j)}{1-P(0)}
        \Bigl[ 1 - \sum_{s=1}^{i} \sum_{c=0}^{s-1} Q(s-c)\, \tilde{O}(d-1,c) \Bigr]
        \Bigl[ 1 - \sum_{s=1}^{i} \sum_{c=0}^{s-1} Q(s-c)\, \tilde{O}(c) \Bigr]^{j-1}        (9)

where Õ(0,0) = 1 and Õ(0,i) = 0 for i > 0.

The limit of this sequence of probability functions is
expressed as follows.

    1 - \sum_{k=0}^{i} \tilde{O}(\infty,k) =
      \sum_{j=1}^{\infty} \frac{P(j)}{1-P(0)}
        \Bigl[ 1 - \sum_{s=1}^{i} \sum_{c=0}^{s-1} Q(s-c)\, \tilde{O}(\infty,c) \Bigr]
        \Bigl[ 1 - \sum_{s=1}^{i} \sum_{c=0}^{s-1} Q(s-c)\, \tilde{O}(c) \Bigr]^{j-1}        (10)

Let LT_d denote the subset of (P,Q)-trees whose leftmost
branches have length d. Let RT denote the set of subtrees of
nontrivial (P,Q)-trees formed by deleting the leftmost subtree
of the root. An arbitrary T ∈ LT_d can be realized as the
grafting of a tree from LT_{d-1} (with attached arc) to a subtree
from RT, as in Figure 6.3. An expression for X(d) can be found by
summing over all combinations of this form. Let Y_d(m) be the
probability that a tree consisting of an arc plus a random tree
from LT_d has a least cost leaf of cost m.

    Y_d(m) = \sum_{k=1}^{m-1} \tilde{O}(d,k)\, Q(m-k)        (11)

Let Z(m) be the expected size of the search tree of a random
tree T in RT given an initial bound of m. Then X(d) has the
form

    X(d) = \sum_{m=0}^{\infty} Y_d(m)\, Z(m).        (12)

Figure 6.3. Formation of an arbitrary tree in LT_d (T_1 ∈ LT_{d-1}, T_2 ∈ RT).


Proposition 6.3.   For all m ≥ 0, Z(m) ≤ Z(m+1) and
\lim_{m\to\infty} Z(m) exists.

Proof: The first part of this proposition is just a particular
case of theorem 2.2. The second part follows from our
assumption that ET(∞) exists, since for all m, Z(m) ≤ ET(m).   QED

Let Z(∞) denote \lim_{m\to\infty} Z(m).


Proposition 6.4.   \lim_{d\to\infty} X(d) is bounded above.

Proof: First note that

    Y_\infty(m) = \sum_{k=1}^{m-1} \tilde{O}(\infty,k)\, Q(m-k),

thus

    \lim_{d\to\infty} X(d) = \lim_{d\to\infty} \sum_{m=0}^{\infty} Y_d(m)\, Z(m)
                           = \sum_{m=0}^{\infty} Y_\infty(m)\, Z(m)
                           \le \sum_{m=0}^{\infty} Y_\infty(m)\, Z(\infty)
                           = Z(\infty) \sum_{m=0}^{\infty} Y_\infty(m) = Z(\infty).        QED

Let X(∞) denote \lim_{d\to\infty} X(d). These propositions support
one of our main results, which states that S(d) grows essentially
linearly in d.


Theorem 6.3.   S(d) is bounded above and below by a linear
function of d.

Proof: Since X(k) ≥ 1, we have

    S(d) = 1 + \sum_{k=1}^{d} X(k)        by (6)
         \ge 1 + \sum_{k=1}^{d} 1 = 1 + d.

Also,

    S(d) = 1 + \sum_{k=1}^{d} X(k)        by (6)
         \le 1 + \sum_{k=1}^{d} Z(\infty)        by proposition 6.4
         = 1 + Z(\infty)\, d.

Therefore, for all d, we have

    1 + d \le S(d) \le 1 + Z(\infty)\, d.        QED

Theorem 6.3 can be interpreted as follows: the depth-first
search tree can be decomposed along the path from the
root to the first found solution into groups of subtrees whose
expected size is asymptotically constant (the i-th group
consists of the 2nd, 3rd, ..., j-th subtrees below the i-th node on
the path from the root to the first found solution). See Figure
6.2. Therefore the performance of branch and bound is expected
to degrade linearly with the depth of the first found solution.

6.3.  Expected  Space  Complexity  of  a  Depth-First  Search 

Let ES(b) denote the expected space complexity of a random
(P,Q)-tree when BB is given an initial bound of b. A recurrence
relation for ES can be set up roughly analogous to the
recurrence relation for ET. First we note that the space
complexity must be at least 1, since initially the root is stored
in memory. In terms of Figure 6.1, suppose that the root has j
sons. At this point in the search the root node has already
been stored and removed from the queue, so we have j nodes in
memory. Let D(b,i) be the expected space complexity of the i-th
subtree of the root when the initial value of the bound is b.
When the root has j sons, the expected maximum amount of
storage needed during the search of the i-th son is D(b,i) + j - i.
The term j - i accounts for the number of nodes at depth 1
remaining in memory during the search of the i-th subtree. The
maximum amount of memory used when the root has j sons is the
maximum over D(b,i) + j - i for all i ≤ j:

    \max\{ 1,\ D(b,1)+j-1,\ D(b,2)+j-2,\ \ldots,\ D(b,j) \}.

This expression only has the value 1 when the root has zero
sons, since D(b,1) ≥ 1 (when a subtree is searched, at least the
root of the subtree was once in memory); thus we find

    ES(b) = \sum_{j=0}^{\infty} P(j) \max\{ 1,\ D(b,1)+j-1,\ D(b,2)+j-2,\ \ldots,\ D(b,j) \}
          = P(0) \cdot 1 + \sum_{j=1}^{\infty} P(j) \max_{i \le j} \{ D(b,i) + j - i \}        (13)


where D(b,i) is the expected space complexity of the i-th
subtree of the root given that the initial bound is b, and can be
formulated in the same way that we set up Wd(b,i):

    D(b,i) = \sum_{c_1=1}^{\infty} \cdots \sum_{c_i=1}^{\infty}
             \sum_{m_1=0}^{\infty} \cdots \sum_{m_{i-1}=0}^{\infty}
               Q(c_1) \cdots Q(c_i)\, \tilde{O}(m_1) \cdots \tilde{O}(m_{i-1})
               ES(\min\{ b,\ c_1+m_1,\ \ldots,\ c_{i-1}+m_{i-1} \} - c_i)        (14)

This expression can be considerably simplified by noting that
for the same reason that ET(b) is monotonically increasing in b
(see theorem 2.2), so also is ES(b).

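Proposition 6.3.  For all j ≥ 1, D(b,1) ≥ D(b,j).

Proof:
    D(b,1) = \sum_{c_1=1}^{\infty} Q(c_1)\, ES(b - c_1)

           = \sum_{c_1=1}^{\infty} \cdots \sum_{c_j=1}^{\infty}
             \sum_{m_1=0}^{\infty} \cdots \sum_{m_{j-1}=0}^{\infty}
               Q(c_1) \cdots Q(c_j)\, \tilde{O}(m_1) \cdots \tilde{O}(m_{j-1})\, ES(b - c_j)

           \ge \sum_{c_1=1}^{\infty} \cdots \sum_{c_j=1}^{\infty}
               \sum_{m_1=0}^{\infty} \cdots \sum_{m_{j-1}=0}^{\infty}
                 Q(c_1) \cdots Q(c_j)\, \tilde{O}(m_1) \cdots \tilde{O}(m_{j-1})
                 ES(\min\{ b,\ c_1+m_1,\ \ldots,\ c_{j-1}+m_{j-1} \} - c_j)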

(since b - c_j ≥ min{b, c_1+m_1, ..., c_{j-1}+m_{j-1}} - c_j and ES is
monotonically increasing in b)

           = D(b,j).    QED

As a result of the above proposition, and writing \bar{P} = \sum_j j P(j)
for the mean number of sons (so that \sum_{j \ge 1} (j-1) P(j) =
\bar{P} - (1 - P(0))),

    ES(b) = P(0) + \sum_{j=1}^{\infty} P(j) \max_{i \le j} \{ D(b,i) + j - i \}
          = P(0) + \sum_{j=1}^{\infty} P(j) (D(b,1) + j - 1)        by the proposition,
          = P(0) + \bar{P} - (1 - P(0)) + (1 - P(0))\, D(b,1)
          = 2P(0) + \bar{P} - 1 + (1 - P(0)) \sum_{c=1}^{\infty} Q(c)\, ES(b-c).        (15)

The limit of ES can be found as follows.

    ES(\infty) = \lim_{b\to\infty} ES(b)
               = \lim_{b\to\infty} \Bigl( 2P(0) + \bar{P} - 1 + (1 - P(0)) \sum_{c=1}^{\infty} Q(c)\, ES(b-c) \Bigr)
               = 2P(0) + \bar{P} - 1 + (1 - P(0)) \sum_{c=1}^{\infty} Q(c)\, ES(\infty)
               = 2P(0) + \bar{P} - 1 + (1 - P(0))\, ES(\infty).

Thus ES(∞) = (2P(0) + \bar{P} - 1)/P(0) = 2 + (\bar{P} - 1)/P(0).        (16)

Suppose we need to estimate the maximum depth of explored
nodes in a depth-first search. A slight alteration of the
above arguments accomplishes this goal. Let Depth(b) = expected
maximum depth of an explored node in a depth-first search of a
random (P,Q)-tree. A recurrence relation for Depth(b) can be
formulated by a slight alteration of (13).

    Depth(0) = 1,

    Depth(b) = 1 + \sum_{j=1}^{\infty} P(j) \max_{i \le j} \{ D(b,i) \}        (17)

where D(b,i) is defined as in eq. (14), with Depth in place of ES.
Again using the proposition that D(b,1) ≥ D(b,j), we can simplify
(17) to




    Depth(b) = 1 + \sum_{j=1}^{\infty} P(j)\, D(b,1)
             = 1 + \sum_{j=1}^{\infty} P(j) \sum_{c=1}^{\infty} Q(c)\, Depth(b-c)

and

    \lim_{b\to\infty} Depth(b) = \lim_{b\to\infty} \Bigl[ 1 + \sum_{j=1}^{\infty} P(j) \sum_{c=1}^{\infty} Q(c)\, Depth(b-c) \Bigr]
                               = 1 + \sum_{j=1}^{\infty} P(j)\, Depth(\infty)
                               = 1 + Depth(\infty)(1 - P(0)).

Therefore

    Depth(\infty)\,(1 - (1 - P(0))) = 1

which yields

    Depth(\infty) = 1/P(0).        (18)
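
The convergence of the simplified Depth recurrence to 1/P(0) can be observed numerically. The Python sketch below is illustrative only: the arc cost distribution Q and the value of P(0) are hypothetical, Q is assumed to have finite support, and Depth(b) is assumed to equal 1 for every b ≤ 0 (an extension of the stated base case Depth(0) = 1).

```python
def depth_sequence(p0, Q, b_max):
    """Iterate Depth(b) = 1 + (1 - P(0)) * sum_c Q(c) * Depth(b - c).

    p0    -- P(0), the probability that a node has no sons
    Q     -- dict mapping an arc cost c >= 1 to its probability Q(c)
    b_max -- largest bound to evaluate
    Assumption: Depth(b) = 1 for all b <= 0.
    """
    depth = [1.0] * (b_max + 1)
    for b in range(1, b_max + 1):
        inner = sum(q * (depth[b - c] if b - c >= 0 else 1.0)
                    for c, q in Q.items())
        depth[b] = 1.0 + (1.0 - p0) * inner
    return depth


values = depth_sequence(p0=0.4, Q={1: 0.5, 2: 0.5}, b_max=200)
print(values[200], 1 / 0.4)
```

Because the sum over j of P(j) for j ≥ 1 is 1 - P(0), the recurrence only depends on P through P(0), which is why a single number suffices in the sketch.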
Chapter 7.
An Application to the Traveling Salesman Problem

In this chapter we will show how the results of the previous
chapters can be applied to a branch and bound algorithm
called a subtour-elimination algorithm for solving asymmetric
traveling salesman problems. The TSP may be stated as follows
using the terminology of chapter 1: we want to find a
constructive proof of ∃x ∀y f(x) ≤ f(y), where x, y are cyclic
permutations of n objects, or hamiltonian cycles on a complete
directed graph with n nodes, and

    f(x) = \sum_{i=1}^{n} c_{i,\,x(i)}

where [c_{i,j}] is an n×n asymmetric matrix giving the cost of the
directed arc from i to j. The size of the instance is n.

A model of a particular branch and bound algorithm is an
appropriate choice of P and Q functions parameterized by the
problem size. We will develop such P and Q functions for the
subtour-elimination algorithm described in chapter 2 by
studying the behavior of the algorithm on the initial feasible
space of permutations. Again, this algorithm makes use of a
relaxation of the requirement that feasible objects be cyclic
permutations, and the initial feasible set is the set S_n of
permutations of n objects. The set S_n is a symmetric set in the
sense that for any given pair π_1, π_2 ∈ S_n there is a relabelling
(automorphism) of the permutations of S_n such that π_1 is mapped
into π_2. From this property it follows that all permutations are
equally likely to be the least cost permutation initially.

We have a class of cost matrices whose entries are
independently and identically distributed random variables. The
problem is to find the least cost permutation with respect to a
given matrix. There are n! permutations in S_n and (n-1)! cyclic
permutations. (We can fix any of the n elements of an n-cycle
as a starting point. Thereafter there are (n-1)! ways to
arrange the remaining n-1 elements to close the cycle.) We
find then that the probability that the least cost permutation
is cyclic is

    P(0) = (n-1)!/n! = 1/n.        (1)
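
The count behind (1) can be confirmed by brute force for small n. The following Python sketch enumerates all permutations and counts those forming a single n-cycle; the function names are hypothetical and the enumeration is only feasible for small n.

```python
from fractions import Fraction
from itertools import permutations


def is_cyclic(perm):
    """True if perm (perm[i] is the successor of i) is a single n-cycle."""
    seen, node = 0, 0
    while True:
        node = perm[node]
        seen += 1
        if node == 0:                 # returned to the starting element
            return seen == len(perm)  # cyclic only if every element was visited


def cyclic_fraction(n):
    """Fraction of the n! permutations of n objects that are cyclic."""
    perms = list(permutations(range(n)))
    return Fraction(sum(is_cyclic(p) for p in perms), len(perms))


for n in range(2, 7):
    print(n, cyclic_fraction(n))
```

Under the symmetry assumption in the text, this fraction is also the probability that the least cost permutation is a tour.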


Let

    H_n = \sum_{k=1}^{n} 1/k        for all n.

The numbers H_n are called harmonic numbers [Knuth 1969] and
occur frequently in the analysis of algorithms. There is a
well-known asymptotic expansion of these numbers

    H_n = \ln(n) + \gamma + 1/(2n) - O(n^{-2})        (2)

where γ = 0.577... is called Euler's constant. From (2), we
obtain the following bounds on H_n,

    \ln(n) + \gamma < H_n < \ln(n) + \gamma + 1/(2n).

The following theorem helps us obtain asymptotic values for
P(k) when k ≥ 1.


Theorem 7.1.  Let S(n,k) denote the probability that a randomly
picked n-permutation is composed of cycles each of order
greater than k, assuming that all permutations are equally
likely. Then

    \lim_{n\to\infty} S(n,k) = e^{-H_k}        for k ≥ 1.


Proof: We will proceed by induction on k. First note
that by definition the number of n-permutations whose cycles
all have order greater than k is n!S(n,k). For the basis of
the induction we note that all n-permutations are composed of
cycles of order greater than 0. So for all n, S(n,0) = 1 =
e^{-H_0}. Assume now that

    \lim_{n\to\infty} S(n,k-1) = e^{-H_{k-1}}        for some k > 0.


The probability S(n,k) can be formulated as (1/n!)*(number of
permutations whose cycles all have order greater than k).
We will use the principle of inclusion-exclusion [Liu 1968] in
order to get S(n,k), essentially by subtracting the number of
permutations which contain some cycles of order k from the
n!S(n,k-1) permutations which have cycles all of order > k-1.
First of all, there are n!S(n,k-1) permutations whose cycles
have order greater than or equal to k. Suppose now that we
select k nodes (regarding them as material for a cycle of order
k). There are \binom{n}{k} ways to select k nodes, (k-1)! ways to
arrange them in a cycle, and there are (n-k)! S(n-k,k-1) ways to
form permutations on the remaining n-k nodes such that all cycles
have order greater than or equal to k. Suppose next that we
select two sets of k nodes. There are \binom{n}{k}\binom{n-k}{k}
ways to select them, (k-1)!(k-1)!/2! unique ways to arrange the
two sets into two cycles of order k (the divisor 2! is the number
of ways of picking the same set of two cycles), and there are
(n-2k)! S(n-2k,k-1) permutations of the remaining n-2k nodes
such that all cycles have order greater than or equal to k. In
general, suppose we select m disjoint sets of k nodes and arrange
each set into a cycle of order k. There are
\binom{n}{k}\binom{n-k}{k}\cdots\binom{n-mk+k}{k} ways to pick m
such sets, ((k-1)!)^m/m! ways to arrange these sets into cycles of
order k (there is a repetition factor of m! because each
particular arrangement of the m cycles can be permuted in m!
ways), and finally there are (n-mk)! S(n-mk,k-1) ways to arrange
the remaining n-mk nodes into permutations composed of cycles of
order greater than or equal to k.

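Applying the principle of inclusion-exclusion we find

    S(n,k) = \frac{1}{n!} \sum_{m=0}^{\lfloor n/k \rfloor} (-1)^m \frac{((k-1)!)^m}{m!}
               \binom{n}{k} \binom{n-k}{k} \cdots \binom{n-mk+k}{k}\, (n-mk)!\, S(n-mk,k-1)

           = \sum_{m=0}^{\lfloor n/k \rfloor} \frac{(-1)^m ((k-1)!)^m}{n!\, m!}
               \frac{n!}{k!\,(n-k)!} \cdot \frac{(n-k)!}{k!\,(n-2k)!} \cdots \frac{(n-mk+k)!}{k!\,(n-mk)!}\, (n-mk)!\, S(n-mk,k-1)

           = \sum_{m=0}^{\lfloor n/k \rfloor} \frac{(-1/k)^m}{m!}\, S(n-mk,k-1).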
When we take the limit of this function, we get

    \lim_{n\to\infty} S(n,k) = \lim_{n\to\infty} \sum_{m=0}^{\lfloor n/k \rfloor} \frac{(-1/k)^m}{m!}\, S(n-mk,k-1)
                             = \sum_{m=0}^{\infty} \frac{(-1/k)^m}{m!} \lim_{n\to\infty} S(n-mk,k-1)
                             = \sum_{m=0}^{\infty} \frac{(-1/k)^m}{m!}\, e^{-H_{k-1}}        (by induction hypothesis)
                             = e^{-H_{k-1}}\, e^{-1/k}
                             = e^{-H_k}.        QED


An immediate corollary of theorem 7.1 is the well-known
result that there are n!S(n,1), which is asymptotic to
n!e^{-H_1} = n!/e, n-permutations which do not have any 1-cycles
(this is known as the problem of derangements [Riordan 1958; Liu
1968]). Our intended application of theorem 7.1 is the
probability that the least cost permutation has k sons (its
smallest subcycle is of order k).
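
The limit in theorem 7.1 can be observed numerically with exact integer arithmetic. The Python sketch below does not follow the inclusion-exclusion argument of the proof; as an alternative, assumed here for convenience, it counts the permutations by conditioning on the length of the cycle that contains one fixed element. Function names are hypothetical.

```python
from fractions import Fraction
from math import exp, factorial


def count_long_cycle_perms(n, k):
    """Number of n-permutations whose cycles all have order greater than k."""
    counts = [1] + [0] * n  # counts[0] = 1 for the empty permutation
    for size in range(1, n + 1):
        total = 0
        # The cycle through a fixed element has some length > k; its other
        # (length - 1) members can be chosen and ordered in
        # (size-1)!/(size-length)! ways.
        for length in range(k + 1, size + 1):
            ways = factorial(size - 1) // factorial(size - length)
            total += ways * counts[size - length]
        counts[size] = total
    return counts[n]


def S(n, k):
    """Probability that a random n-permutation has only cycles longer than k."""
    return Fraction(count_long_cycle_perms(n, k), factorial(n))


def harmonic(k):
    return sum(Fraction(1, i) for i in range(1, k + 1))


for k in range(1, 5):
    print(k, float(S(60, k)), exp(-float(harmonic(k))))
```

For k = 1 the count is the number of derangements, so the first row approaches 1/e.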


Theorem 7.2.  The asymptotic probability that the least cost
n-permutation on a random cost matrix has a smallest order
cycle of order k is

    \lim_{n\to\infty} P_n(k) = e^{-H_{k-1}} - e^{-H_k}.        (3)


Proof: We have already noted that each n-permutation is
equally likely to be the least cost permutation over a random
cost matrix. The probability that a random permutation π has a
smallest subcycle of order k is the probability that the cycles
of π have order greater than k-1 minus the probability that the
subcycles of π have order greater than k. The theorem then
follows directly from theorem 7.1.

The probability that the least cost permutation has a
1-cycle is roughly e^{-H_0} - e^{-H_1} = 1 - 1/e = 0.63... . Since a
traveling salesman tour cannot have any 1-cycles, if we insert
infinities along the diagonal of our random cost matrices we do
not lose any cyclic permutations yet reduce the size of the
feasible space by about 63%. Unfortunately there is no readily
apparent analogous method for precluding permutations with
2-cycles (or higher order cycles). We can estimate the
probability that a cyclic permutation is optimal with respect to
the altered matrix as

    P'(0) = (n-1)!/(n!/e) = e/n.        (4)


It cannot be shown that (4) is asymptotically correct as easily
as (1) can be shown correct, because the set of permutations
without 1-cycles is not symmetric in the sense given above.
Nonetheless, observations of randomly generated traveling
salesman problems support (4). See Table 1 and Table 2. Next we
might ask how P(k) is affected by this alteration of the cost
matrices. Let

    P'(k) = Pr(the smallest cycle of π has order k | k > 1)
          = P(k)/(1 - P(1))
          = (e^{-H_{k-1}} - e^{-H_k}) / (1 - (1 - 1/e))
          = e\,(e^{-H_{k-1}} - e^{-H_k}).        (5)


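Given (5), we can find an upper bound on \bar{P}', the mean
number of sons under the altered matrices.

    \bar{P}' = \sum_{k=2}^{n/2} k\, P'(k)
             = \sum_{k=2}^{n/2} k\, e\,(e^{-H_{k-1}} - e^{-H_k})
             = e\,[\, 2(e^{-H_1} - e^{-H_2}) + 3(e^{-H_2} - e^{-H_3}) + \cdots + (n/2)(e^{-H_{n/2-1}} - e^{-H_{n/2}}) \,]
             = e\,\Bigl[\, e^{-H_1} - (n/2)\, e^{-H_{n/2}} + \sum_{k=1}^{n/2-1} e^{-H_k} \Bigr]
             < e\,\Bigl[\, e^{-1} - (n/2)(2/n)\, e^{-\gamma - 1/(2(n/2))} + \sum_{k=1}^{n/2-1} e^{-\gamma}/k \Bigr]

(we have made use of the bounds obtained above on H_n),

             = 1 - e^{1-\gamma} e^{-1/n} + e^{1-\gamma} H_{(n/2)-1}.        (6)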


With P'(0) and \bar{P}' we can now test our estimated expected
time for solving randomly generated traveling salesman problems
using a subtour-elimination algorithm under the best-bound-first
search strategy. Inserting the estimate (4) and the bound (6) into
equation 3-9, we have

    E(N_T) = 1 + \bar{P}/P(0)
           < 1 + [\, 1 - e^{1-\gamma} e^{-1/n} + e^{1-\gamma} H_{(n/2)-1} \,] / (e/n)
           = 1 + n/e - n\, e^{-\gamma} e^{-1/n} + e^{-\gamma}\, n\, H_{(n/2)-1}        (7)
           = O(n \ln(n)).        (7')
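
The bounds (6) and (7) and the estimate e/n from (4) are easy to evaluate directly. The Python sketch below does so for n = 10, 15, 20, 25; it assumes that n/2 is taken as the integer part of n/2 when n is odd, which is an assumption of this sketch and not stated in the text. Function names are hypothetical.

```python
from math import e, exp

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def harmonic(n):
    return sum(1.0 / k for k in range(1, n + 1))


def mean_sons_bound(n):
    """Bound (6) on the mean number of sons, with n/2 taken as n // 2."""
    h = harmonic(n // 2 - 1)
    scale = exp(1 - EULER_GAMMA)
    return 1 - scale * exp(-1 / n) + scale * h


def search_tree_bound(n):
    """Bound (7) on E(N_T): 1 + (bound (6)) / (e/n)."""
    return 1 + mean_sons_bound(n) / (e / n)


for n in (10, 15, 20, 25):
    print(n,
          round(search_tree_bound(n), 2),
          round(mean_sons_bound(n), 2),
          round(e / n, 3))
```

Under the integer-part assumption, the printed values agree with the theoretical columns of Table 1 to within about 0.01.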


For each of the nodes counted by N_T, we solve the
corresponding assignment problem. As stated above in chapter
2, these assignment problems take O(n^3) time at the root (the
initial problem) and O(n^2) time for subsequent problems. The
only other factor in the running time of the algorithm is the
time required to maintain the priority queue. Using available
techniques for implementing priority queues [Aho, Hopcroft, and
Ullman 1974], the time required to insert or access a node in
the queue when n nodes are in it is O(ln(n)). The access and
insertion time per node for any branch and bound algorithm
depends on the order of magnitude of the space complexity, the
maximum number of nodes in storage during the search. For the
subtour-elimination algorithm the space complexity is
O(n ln(n)), thus the mean queue maintenance time is

    O(\ln(n \ln(n))) = O(\ln(n) + \ln\ln(n)) = O(\ln(n)).

Putting these quantities together, we expect the running time
of the subtour-elimination algorithm to be

    1 \cdot O(n^3) + O(n \ln(n)) \cdot O(n^2) \cdot O(\ln(n)) = O(n^3 \ln^2(n)).

In Table 1, the bounds (7) are computed for several values of
n. Compared with these values are empirical values of E(N_T)
found by averaging N_T from 1000 randomly generated traveling
salesman problems for each value of n, solved by the subtour-
elimination algorithm under a best-bound-first search strategy.
Random cost matrices were generated by putting independently
and uniformly distributed random integers between 1 and 1000 in
each entry. The diagonal entries were set to a very large
number.

TABLE 1. Data from the solution of randomly generated traveling
salesman problems by a subtour-elimination algorithm using a
best-bound-first search strategy, compared with theoretical
estimates of the corresponding values.

Size of   No. of     Sample mean   Mean search   Sample   Bound on   Sample    P(0) by
problem   problems   search tree   tree size     F at     P by       P(0)      eq. 7-4
          solved     size          by eq. 7-7    root     eq. 7-6    at root   (= e/n)
----------------------------------------------------------------------------------------
  10       1000         6.48         11.29       2.03     2.80       .261      .272
  15       1000        12.28         19.27       2.58     3.31       .186      .181
  20       1000        19.63         29.44       3.09     3.87       .153      .136
  25        790        31.19         39.10       3.50     4.14       .106      .109

Table 2 presents data on the probabilities of the various
branching factors of nodes at different depths in the search
tree. Notice that P(0) seems to increase monotonically with
depth. This provides evidence that e/n is indeed a lower bound
on P(0). (For n = 20, we have e/20 = 0.136... compared with
0.135 for P(0) at depth 0.) The most dramatic changes take
place between depth 0 and depth 1. In particular, P(0) almost
doubles and P(2) roughly halves. Note that at depth 0 the
sample mean is 3.018, whereas our estimated mean using (5) is

    \sum_{k=2}^{10} k e (e^{-H_{k-1}} - e^{-H_k}) = 2.982... .
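As a numerical check, the estimated mean quoted here can be
evaluated directly. The sketch below assumes that H_k denotes
the k-th harmonic number 1 + 1/2 + ... + 1/k, which is not
restated in this section; the function name is ours. Under that
assumption the sum evaluates to about 2.98, in agreement with
the value quoted.

```python
import math


def estimated_mean(upper):
    """Evaluate sum_{k=2}^{upper} k * e * (exp(-H_{k-1}) - exp(-H_k)).

    Assumption: H_k is the k-th harmonic number 1 + 1/2 + ... + 1/k.
    """
    total = 0.0
    h_prev = 1.0  # H_1
    for k in range(2, upper + 1):
        h_k = h_prev + 1.0 / k
        total += k * math.e * (math.exp(-h_prev) - math.exp(-h_k))
        h_prev = h_k
    return total


# For n = 20 the upper limit of the sum in the text is 10.
print(round(estimated_mean(10), 2))
```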

It was first suggested by Bellmore and Malone [Bellmore
and Malone 1971] that subtour-elimination algorithms exhibit
polynomial expected time behavior on randomly generated
problems. The proof of this behavior rests entirely on showing
that (1) or (4) is a lower bound on P(0) (the probability that
a randomly picked node in a randomly picked search tree has
zero sons) or, as noted in [Rinnooy Kan and Lenstra 1978], that
O(n^{-c}) for any constant c is a lower bound on P(0). This
important result is the object of current research.

For some time now it has been taken as a general guide
that if a problem can be solved by an algorithm that runs in
polynomial time then it is a tractable problem. If its
algorithms run in superpolynomial time then it is intractable.
So far no polynomial time algorithm has been found for any
NP-complete problem, so they are considered intractable. But
the NP-complete (and NP-hard) problems are intractable only in
terms of a worst-case bound; at present no known algorithm is
guaranteed to halt with a solution within a polynomial amount
of time. Here, though, we have in the traveling salesman
problem an NP-complete problem which seems to be solvable on
the average in polynomial time. Thus many instances of the
traveling salesman problem can be tractably solved, but a few
hard instances of the problem cause intractable behavior. The
existence of such problems takes some of the sting from the
possibility that P ≠ NP. One might reasonably ask whether all
NP-complete problems are solvable in polynomial expected time.
In fact we might define a new class of problems called EP which
are solvable in polynomial time on the average. Certainly
P ⊆ EP, and it seems that the traveling salesman problem is in
EP. Goldberg [Goldberg 1979] has recently shown that the
satisfiability problem seems to be solvable in polynomial
expected time. A proof that a problem Π ∈ NP is not in EP
constitutes a proof that P ≠ NP, since P is a subset of EP. The
problem of course with this definition of EP is the need to
define an appropriate probability measure on a problem.
Pathological probability measures can be found which emphasize
the hardest instances of a problem (thus making the problem
seem hard), or emphasize the easiest instances (making the
problem seem easily solvable). Goldberg chose the reasonable
course of showing that the satisfiability problem is solvable
in polynomial expected time under several different probability
measures on the problem. So a meaningful definition of EP
awaits further insight into what we mean by a natural or
reasonable probability measure on a problem.

In  order  to  predict  some  of  the  properties  of  a  depth- 
first  search  on  traveling  salesman  problems,  we  need  a  way  of 
estimating  the  probability  function  for  the  arc  costs,  Q.  We 
have  found  empirically  that  Q  is  estimated  by  the  geometric 
function 


    Q_n(k) = (0.00054 n) (1 + 0.00054 n)^{-k}                (8)


where n is the size of the class of problems. Table 3 compares
some sample mean time complexity statistics for randomly
generated traveling salesman problems solved using a
depth-first search with estimates generated by the function ET
introduced in chapter 6. The randomly generated problems were
given an initial bound of 1000 (actually 1000 + the lower bound
on the initial feasible set) and the recurrence relation for ET
was computed out to ET(1000). We used (3) for Q and our
formulas (4) and (5) for P in computing ET; the results appear
in the column marked ET(1000). Note that ET(1000) using this P
function underestimates the sample mean. According to our
investigation of the best-bound-first search strategy, we need
an upper bound on P and a lower bound on P(0) in order to get
an estimate which bounds our sample mean from above. While F
computed from (4) and (5) is a good estimate for the mean
branching factor over all root nodes, it does not seem to be a
good enough bound on the average branching factor at other
depths according to Table 2. We obtain good upper bounds by
amending P as follows: halve P(2) and distribute the difference
over P(3), P(4), ..., P(\lfloor n/2 \rfloor). We retain
P(0) = e/n. In this way the mean of P has been increased and
the lower bound on P(0) remains. The bounds obtained using this
P function in ET are given in the column labeled ET'(1000) in
Table 3.
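The amendment of P described above can be sketched as follows.
The text does not say how the mass removed from P(2) is shared
among P(3), ..., P(floor(n/2)); the sketch assumes an even
split. The function names and the example distribution are
hypothetical and are not data from this report.

```python
def amend_branching_distribution(P, n):
    """Return a copy of P with half of P[2] moved to P[3..n//2].

    P[j] is the probability of branching factor j, for j = 0..n//2.
    Assumption: the moved mass is split evenly; the text only says
    it is distributed over P(3), ..., P(floor(n/2)).
    """
    top = n // 2
    if len(P) != top + 1 or top < 3:
        raise ValueError("P must cover j = 0..n//2 with n >= 6")
    amended = list(P)
    moved = P[2] / 2.0
    amended[2] -= moved
    share = moved / (top - 2)
    for j in range(3, top + 1):
        amended[j] += share
    return amended


def mean(P):
    """Mean branching factor of the distribution P."""
    return sum(j * p for j, p in enumerate(P))


# Hypothetical distribution for n = 10 (illustration only).
P = [0.27, 0.0, 0.50, 0.13, 0.06, 0.04]
P_amended = amend_branching_distribution(P, 10)
print(mean(P), mean(P_amended))
```

P(0) is untouched, the total probability is preserved, and the
mean can only increase because mass moves from j = 2 to larger
branching factors.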

Theorem 6.3 predicts that the expected size of the search
tree in a depth-first search grows essentially linearly as a
function of the length of the leftmost path in the search tree.
At the same time that we found the sample mean search tree size
of random traveling salesman problems above, we sampled the
search tree size as a function of the length of the leftmost
branch of the tree. These data are presented in Table 4 and
graphically in Figure 7.1. The data in Figure 7.1 clearly
show the linear growth of the mean search tree size for as far
as the sample means are meaningful.

TABLE 3. Data from the solution of randomly generated traveling
salesman problems by a subtour-elimination algorithm using a
depth-first search strategy and given an initial bound of
1000 (1000 above and beyond the lower bound on the root). These
data are compared with estimates computed from our model.

Size of   No. of     Sample mean   ET        ET'       Sample mean   Stack depth
problem   problems   search tree   bound     bound     stack depth   bound
          solved     size          =1000     =1000                   eq. 6-18
----------------------------------------------------------------------------------
  10       1000        10.36       11.06     13.45        2.98          3.68
  15       1000        35.82       30.03     38.61        4.83          5.52
  20        790        81.85       64.40     88.72        5.50          7.36

TABLE 4. Data from randomly generated traveling salesman
problems giving the mean time complexity as a function of the
length of the leftmost path in the search tree.

Size of   No. of     Mean search tree size when the leftmost branch has length k
problem   problems
          solved     k = 0     1     2     3     4     5     6     7     8     9    10
----------------------------------------------------------------------------------------
  10       1000          1     5    12    20    27    36    43    51   ...
  15       1000          1    14    35    48    72    89    99   111   ...   145   ...
  20        780          1    17    35    85    94   128   ...   299   ...

Figure 7.1. The data from Table 4 plotted, showing the growth of
the sample mean search tree size as a function of the length of
the leftmost branch in the search tree. The circles, pluses, and
x's represent data points from traveling salesman problems of
size 10, 15, and 20 respectively. The problems were solved by a
subtour-elimination algorithm using a depth-first search
strategy.


Conclusions 
Chapter  8 

We have studied a model of branch and bound algorithms
and derived expressions for the mean space and time
requirements for several search strategies. It has been shown
that in our model the best-bound-first search strategy is
optimal in terms of our measures of time and space complexity.
The results we have obtained are essentially order of magnitude
results, and it may turn out in practice that the constants
associated with the order of magnitude for a given algorithm
make a difference as far as the choice of search strategy is
concerned. In a best-bound-first search a unit of storage may
be quite large if, for example, we need to store an entire
matrix as in an integer linear program, since enough
information must be stored in order to restart the search from
each unexplored node. On the other hand, in a depth-first
search, the context of the search is stored in the ancestors of
a node, so comparatively little information need be stored per
node. For this reason the best-bound-first search strategy,
although widely recognized as optimal in terms of time
complexity, is viewed as excessively space-consuming. Another
complaint against the best-bound-first search strategy is the
inefficiency caused by the bookkeeping involved. But there are
efficient data structures and associated routines for their
manipulation available now which can be used to implement this
strategy; one, mentioned in chapter 2, is the priority queue
[Aho, Hopcroft, and Ullman 1974]. Of the roughly 45 minutes of
CPU time on an IBM 370/165 spent in producing the data of
Table 1, less than 6 seconds were spent maintaining the
priority queue. Breadth-first is not usually a practical choice
of search strategy for branch and bound algorithms because it
has the disadvantages of best-bound-first and depth-first
without their advantages. Breadth-first is like
best-bound-first in that all nodes in memory are effectively
the roots of different search trees, and for each node all
information necessary for starting up the associated subproblem
must be stored. This means that breadth-first search has a
large constant associated with its space complexity. On the
other hand it is like a depth-first search in that it is easy
to construct a tree for which a best-bound search explores
fewer nodes than breadth-first search. So it is nonoptimal in
terms of time complexity. A breadth-first search is reasonable,
however, when it is known or suspected that the optimal
solution is found at a shallow depth.
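To illustrate how little bookkeeping a priority queue requires,
the following is a minimal sketch of a best-bound-first search
built on a binary heap. It is not the program used to produce
Table 1; the function names, the callback interface, and the
toy tree are assumptions made for the example. It also assumes
that lower bounds never decrease from a node to its sons and
that the bound of a feasible node equals its cost, so that the
first feasible node removed from the queue is optimal.

```python
import heapq
from itertools import count


def best_bound_first(root, lower_bound, sons, is_feasible):
    """Return (cost, node, nodes_examined) for the first feasible node removed."""
    tie = count()  # tie-breaker so the heap never compares nodes directly
    queue = [(lower_bound(root), next(tie), root)]
    examined = 0
    while queue:
        bound, _, node = heapq.heappop(queue)  # node with the least bound
        examined += 1
        if is_feasible(node):
            return bound, node, examined
        for son in sons(node):
            heapq.heappush(queue, (lower_bound(son), next(tie), son))
    return None


# Hypothetical search tree: sons and lower bounds of each node.
TREE = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"]}
BOUND = {"root": 0, "a": 3, "b": 2, "a1": 7, "a2": 4, "b1": 5}
FEASIBLE = {"a1", "a2", "b1"}

result = best_bound_first(
    "root",
    lower_bound=BOUND.__getitem__,
    sons=lambda node: TREE.get(node, []),
    is_feasible=lambda node: node in FEASIBLE,
)
print(result)
```

Each insertion and removal costs time logarithmic in the number
of stored nodes, which is in keeping with the small share of
running time reported above for queue maintenance.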

Our model is particularly suited for modelling relaxation
procedures, where there is some chance that any node in the
search tree of a random problem from a class may produce a
feasible solution. The success of the assignment problem
relaxation for solving asymmetric traveling salesman problems
and Held and Karp's 1-tree relaxation for solving symmetric
traveling salesman problems suggests that the search for
polynomial expected time algorithms for solving hard
combinatorial problems might begin by looking for suitable
relaxations and fast algorithms for solving them. The search
for fast approximate algorithms for hard combinatorial problems
can also benefit from the use of relaxations of a problem. A
relaxed solution to a problem may have many of the components
of an optimal feasible solution. A heuristic restructuring of
the relaxed solution might produce a feasible solution of near
optimal cost.




APPENDIX 


Several of the results of this thesis have been formulated
as somewhat complex recurrence relations. In this section we
show how two of these recurrence relations can be broken down
into simpler relations which aid in the computation of their
sequences.

In chapter 3 the function O was introduced in the form

    1 - \sum_{k=0}^{i} O(k)
        = \sum_{j=1}^{\infty} P(j) [1 - \sum_{s=1}^{i} \sum_{c=1}^{s} Q(c) O(s-c)]^j        (1)

with boundary condition O(0) = P(0).

Let

    E(s) = \sum_{c=1}^{s} Q(c) O(s-c),

    G(i) = 1 - \sum_{s=1}^{i} \sum_{c=1}^{s} Q(c) O(s-c)
         = 1 - \sum_{s=1}^{i} E(s) = G(i-1) - E(i),

    B(i) = \sum_{j=1}^{\infty} P(j) G(i)^j,

    O(i) = B(i-1) - B(i).

Note that B(i) = 1 - \sum_{k=0}^{i} O(k), therefore
B(i-1) - B(i) = O(i). In terms of these functions the
computation of O proceeds as given
in the high level algorithm of Figure A.1.

Figure A.1. An algorithm for computing O on the range
0, 1, 2, ..., limit, given the probability functions P and Q.

    begin
      G(0) := 1;
      B(0) := 1 - P(0);
      O(0) := P(0);
      for i := 1 until limit
      begin
        E(i) := \sum_{c=1}^{i} Q(c) O(i-c);
        G(i) := G(i-1) - E(i);
        B(i) := \sum_{j=1}^{\infty} P(j) G(i)^j;
        O(i) := B(i-1) - B(i);
      end
    end

For some Q functions, E(i) may be easily expressed as a
recurrence relation, further simplifying the computation of O
(and the computation of ET given below). For example, if Q is
geometric, Q(c) = r s^c, then

    E(i) = \sum_{c=1}^{i} r s^c O(i-c)
         = (1/s) \sum_{c=1}^{i} r s^{c+1} O((i+1) - (c+1))
         = (1/s) \sum_{c=2}^{i+1} r s^c O(i+1-c)
         = (1/s) (E(i+1) - r s O(i)),

therefore

    E(i+1) = s E(i) + r s O(i).
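The two computations above can be sketched in Python as
follows. This is a minimal sketch rather than the program used
for this report: it assumes P has finite support, so that the
infinite sum over j stops at the largest branching factor, and
it takes G(0) = 1 because the double sum defining G is empty at
i = 0. The function names and example distributions are ours.
The second function uses the recurrence E(i+1) = sE(i) + rsO(i)
for geometric Q and should agree with the direct summation.

```python
def compute_O(P, Q, limit):
    """Figure A.1 with E(i) summed directly.

    P[j] = probability of j sons, j = 0..len(P)-1 (finite support assumed).
    Q(c) = probability that an arc has cost c, c >= 1.
    Returns [O(0), ..., O(limit)].
    """
    O = [P[0]]
    G = 1.0               # G(0): the empty double sum leaves 1
    B_prev = 1.0 - P[0]   # B(0)
    for i in range(1, limit + 1):
        E = sum(Q(c) * O[i - c] for c in range(1, i + 1))
        G -= E
        B = sum(P[j] * G ** j for j in range(1, len(P)))
        O.append(B_prev - B)
        B_prev = B
    return O


def compute_O_geometric(P, r, s, limit):
    """Same computation for Q(c) = r * s**c, using E(i+1) = s*E(i) + r*s*O(i)."""
    O = [P[0]]
    G = 1.0
    B_prev = 1.0 - P[0]
    E = 0.0               # E(0) is an empty sum
    for i in range(1, limit + 1):
        E = s * E + r * s * O[i - 1]
        G -= E
        B = sum(P[j] * G ** j for j in range(1, len(P)))
        O.append(B_prev - B)
        B_prev = B
    return O


def unit_cost(c):
    """Hypothetical arc-cost distribution: every arc costs exactly 1."""
    return 1.0 if c == 1 else 0.0


# A chain in which each node has one son with probability 1/2.
print(compute_O([0.5, 0.5], unit_cost, 2))
```

The recurrence replaces an O(i) summation at step i by a
constant amount of work, so the whole table of O values is
computed in time linear in limit (for bounded branching
factor) instead of quadratic.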
The recurrence relation for ET(b) introduced in chapter 6 can be
simplified in a similar manner. ET(b) has the form

    ET(b) = 1 + \sum_{j=1}^{\infty} P(j) \sum_{i=1}^{j} Wd(b,i)        (2)

where

    Wd(b,i) = \sum_{c_1=1}^{\infty} ... \sum_{c_i=1}^{\infty}
              \sum_{m_1=0}^{\infty} ... \sum_{m_{i-1}=0}^{\infty}
              Q(c_1) ... Q(c_i) O(m_1) ... O(m_{i-1}) *
              ET(\min\{b, c_1+m_1, c_2+m_2, ..., c_{i-1}+m_{i-1}\} - c_i)        (3)


Essentially, Wd(b,i) has the form

    Wd(b,i) = \sum_{k=1}^{\infty} R(b,i,k) \sum_{c_i=1}^{\infty} Q(c_i) ET(k-c_i)        (4)

where R(b,i,k) = probability that
k = \min\{b, c_1+l_1, ..., c_{i-1}+l_{i-1}\} (the term c_j+l_j,
written c_j+m_j in (3), is the cost of the least cost leaf in
the j-th subtree below the root; c.f. Figure 3b). In other
words, k is the value of the bound immediately after the
(i-1)-st subtree has been explored. R(b,i,k) may be formulated
easily as follows. We have 2 cases, either k = b or k < b. The
probability that k = b is

    R(b,i,b) = Pr(c_1+l_1 \geq b) * Pr(c_2+l_2 \geq b) * ... * Pr(c_{i-1}+l_{i-1} \geq b).


Again let

    E(s) = \sum_{c=1}^{s} Q(c) O(s-c),

    G(k) = 1 - \sum_{s=1}^{k-1} \sum_{c=1}^{s} Q(c) O(s-c)
         = 1 - \sum_{s=1}^{k-1} E(s) = G(k-1) - E(k-1).

Here G(k) = Pr(c+l \geq k) and E(k) = Pr(c+l = k), so

    R(b,i,b) = G(b)^{i-1}.        (5)


The other case we need to consider occurs when one of the
subtrees contains a least cost leaf which improves the initial
bound b. The probability that the bound has the value m is the
probability that one of the subtrees has a least cost leaf of
cost m and the rest have least cost leaves of cost \geq m. Thus,
noticing that each of the i-1 subtrees may contain the least
cost leaf, we have

    R(b,i,m) = (i-1) * E(m) * G(m)^{i-2}.        (6)


Substituting (5) and (6) into (4) we get

    Wd(b,i) = \sum_{k=1}^{b-1} (i-1) E(k) G(k)^{i-2} D(k) + G(b)^{i-1} D(b)

where D(k) = \sum_{c=1}^{k} Q(c) ET(k-c). Further, letting

    H(b,i) = \sum_{k=1}^{b-1} (i-1) E(k) G(k)^{i-2} D(k)
           = H(b-1,i) + (i-1) E(b-1) G(b-1)^{i-2} D(b-1)        (7)

we have

    Wd(b,i) = H(b,i) + G(b)^{i-1} D(b).


Looking again at (2), we see that we need partial sums of
Wd(b,i), so let

    W(b,i) = \sum_{j=1}^{i} Wd(b,j) = W(b,i-1) + Wd(b,i)
           = W(b,i-1) + H(b,i) + G(b)^{i-1} D(b).        (8)


Putting all these pieces together, we can compute ET as in
Figure A.2. The infinities which appear in the algorithms of
Figures A.1 and A.2 only come into play when P has an infinite
range, i.e., arbitrarily large branching factors are possible.
In most practical classes of problems the branching factor is
in fact bounded. When modeling such cases the infinities are
replaced by whatever bound exists on the branching factor. In
an implementation of this algorithm, the arrays E, G, and D can
be replaced by single variables since only the most recently
computed value of the corresponding array is ever used.
Similarly the 2-dimensional arrays W and H can be reduced to
1-dimensional arrays.

Figure A.2. An algorithm for computing ET(b) for the expected
size of a depth-first search tree given P, Q, and O.

    begin
      ET(0) := 1;
      for all b, W(b,0) := 0;
      for all i, H(0,i) := 0;
      G(0) := 1;
      E(0) := 0;
      D(0) := 0;
      for b := 1 until limit
      begin
        for i := 1 until \infty
          H(b,i) := H(b-1,i) + (i-1) E(b-1) G(b-1)^{i-2} D(b-1);
        G(b) := G(b-1) - E(b-1);
        E(b) := \sum_{c=1}^{b} Q(c) O(b-c);
        D(b) := \sum_{c=1}^{b} Q(c) ET(b-c);
        for i := 1 until \infty
          W(b,i) := W(b,i-1) + H(b,i) + G(b)^{i-1} D(b);
        ET(b) := 1 + \sum_{j=1}^{\infty} P(j) W(b,j);
      end
    end
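A minimal Python sketch of the computation of Figure A.2
follows. It assumes P has finite support, so the infinities
become the largest branching factor, and it uses the space
reductions mentioned above: E, G, and D are single variables
and H is a 1-dimensional array indexed by i. The update of H
follows the recurrence in b given by (7), and the i = 1 term is
skipped because its factor (i-1) is zero. The function names
and the example distributions are ours and are illustrative
only.

```python
def expected_tree_size(P, Q, O, limit):
    """Return [ET(0), ..., ET(limit)] following Figure A.2.

    P[j] = probability of j sons, j = 0..len(P)-1 (finite support assumed).
    Q(c) = probability that an arc has cost c, c >= 1.
    O[k] for k = 0..limit, e.g. as produced by Figure A.1.
    """
    J = len(P) - 1                 # largest branching factor
    ET = [1.0]                     # ET(0) := 1
    H = [0.0] * (J + 1)            # H[i] holds H(b, i); H(0, i) = 0
    G_prev, E_prev, D_prev = 1.0, 0.0, 0.0   # G(0), E(0), D(0)
    for b in range(1, limit + 1):
        for i in range(2, J + 1):  # i = 1 contributes nothing
            H[i] += (i - 1) * E_prev * G_prev ** (i - 2) * D_prev
        G = G_prev - E_prev
        E = sum(Q(c) * O[b - c] for c in range(1, b + 1))
        D = sum(Q(c) * ET[b - c] for c in range(1, b + 1))
        W = 0.0                    # W(b, 0) := 0
        total = 0.0
        for j in range(1, J + 1):
            W += H[j] + G ** (j - 1) * D     # W(b, j)
            total += P[j] * W
        ET.append(1.0 + total)
        G_prev, E_prev, D_prev = G, E, D
    return ET


def unit_cost(c):
    """Hypothetical arc-cost distribution: every arc costs exactly 1."""
    return 1.0 if c == 1 else 0.0


# A chain: one son with probability 1/2, unit arc costs,
# for which O(k) = (1/2)^(k+1).
print(expected_tree_size([0.5, 0.5], unit_cost, [0.5, 0.25, 0.125], 2))
```

For the chain example the recurrence reduces to
ET(b) = 1 + ET(b-1)/2, which gives a quick sanity check on the
implementation.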